Deep reinforcement learning for robotics: A survey of real-world successes

C Tang, B Abbatematteo, J Hu… - Annual Review of …, 2024 - annualreviews.org
Reinforcement learning (RL), particularly its combination with deep neural networks,
referred to as deep RL (DRL), has shown tremendous promise across a wide range of …

Spatialvlm: Endowing vision-language models with spatial reasoning capabilities

B Chen, Z Xu, S Kirmani, B Ichter… - Proceedings of the …, 2024 - openaccess.thecvf.com
Understanding and reasoning about spatial relationships is crucial for Visual Question
Answering (VQA) and robotics. Vision Language Models (VLMs) have shown impressive …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Robot learning in the era of foundation models: A survey

X Xiao, J Liu, Z Wang, Y Zhou, Y Qi, Q Cheng… - arxiv preprint arxiv …, 2023 - arxiv.org
The proliferation of Large Language Models (LLMs) has fueled a shift in robot learning
from automation towards general embodied Artificial Intelligence (AI). Adopting foundation …

Nomad: Goal masked diffusion policies for navigation and exploration

A Sridhar, D Shah, C Glossop… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Robotic learning for navigation in unfamiliar environments needs to provide policies for both
task-oriented navigation (i.e., reaching a goal that the robot has located), and task-agnostic …

Habitat 3.0: A co-habitat for humans, avatars and robots

X Puig, E Undersander, A Szot, MD Cote… - arxiv preprint arxiv …, 2023 - arxiv.org
We present Habitat 3.0: a simulation platform for studying collaborative human-robot tasks in
home environments. Habitat 3.0 offers contributions across three dimensions: (1) Accurate …

CLIP-Fields: Weakly supervised semantic fields for robotic memory

NMM Shafiullah, C Paxton, L Pinto, S Chintala… - arxiv preprint arxiv …, 2022 - arxiv.org
We propose CLIP-Fields, an implicit scene model that can be used for a variety of tasks,
such as segmentation, instance identification, semantic search over space, and view …

When is multilinguality a curse? Language modeling for 250 high- and low-resource languages

TA Chang, C Arnett, Z Tu, BK Bergen - arxiv preprint arxiv:2311.09205, 2023 - arxiv.org
Multilingual language models are widely used to extend NLP systems to low-resource
languages. However, concrete evidence for the effects of multilinguality on language …

Habitat synthetic scenes dataset (hssd-200): An analysis of 3d scene scale and realism tradeoffs for objectgoal navigation

M Khanna, Y Mao, H Jiang, S Haresh… - Proceedings of the …, 2024 - openaccess.thecvf.com
We contribute the Habitat Synthetic Scene Dataset, a dataset of 211 high-quality 3D
scenes and use it to test navigation agent generalization to realistic 3D environments. Our …