Do as I can, not as I say: Grounding language in robotic affordances
M Ahn, A Brohan, N Brown, Y Chebotar… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models can encode a wealth of semantic knowledge about the world. Such
knowledge could be extremely useful to robots aiming to act upon high-level, temporally …
Voyager: An open-ended embodied agent with large language models
We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft
that continuously explores the world, acquires diverse skills, and makes novel discoveries …
Perceiver-Actor: A multi-task transformer for robotic manipulation
Transformers have revolutionized vision and natural language processing with their ability to
scale with large datasets. But in robotic manipulation, data is both limited and expensive …
LLM-Planner: Few-shot grounded planning for embodied agents with large language models
This study focuses on using large language models (LLMs) as a planner for embodied
agents that can follow natural language instructions to complete complex tasks in a visually …
Task and motion planning with large language models for object rearrangement
Multi-object rearrangement is a crucial skill for service robots, and commonsense reasoning
is frequently needed in this process. However, achieving commonsense arrangements …
CLIP-Fields: Weakly supervised semantic fields for robotic memory
We propose CLIP-Fields, an implicit scene model that can be used for a variety of tasks,
such as segmentation, instance identification, semantic search over space, and view …
Do embodied agents dream of pixelated sheep: Embodied decision making using language guided world modelling
Reinforcement learning (RL) agents typically learn tabula rasa, without prior knowledge of
the world. However, if initialized with knowledge of high-level subgoals and transitions …
ESC: Exploration with soft commonsense constraints for zero-shot object navigation
The ability to accurately locate and navigate to a specific object is a crucial capability for
embodied agents that operate in the real world and interact with objects to complete tasks …
FILM: Following instructions in language with modular methods
Recent methods for embodied instruction following are typically trained end-to-end using
imitation learning. This often requires the use of expert trajectories and low-level language …
OK-Robot: What really matters in integrating open-knowledge models for robotics
Remarkable progress has been made in recent years in the fields of vision, language, and
robotics. We now have vision models capable of recognizing objects based on language …