Mutex: Learning unified policies from multimodal task specifications
Humans use different modalities, such as speech, text, images, videos, etc., to communicate
their intent and goals with teammates. For robots to become better assistants, we aim to …
Quest: Self-supervised skill abstractions for learning continuous control
Generalization capabilities, or rather a lack thereof, is one of the most important unsolved
problems in the field of robot learning, and while several large scale efforts have set out to …
Learning generalizable manipulation policies with object-centric 3D representations
We introduce GROOT, an imitation learning method for learning robust policies with object-
centric and 3D priors. GROOT builds policies that generalize beyond their initial training …
Bootstap: Bootstrapped training for tracking-any-point
To endow models with greater understanding of physics and motion, it is useful to enable
them to perceive how solid surfaces move and deform in real scenes. This can be formalized …
Robot utility models: General policies for zero-shot deployment in new environments
Robot models, particularly those trained with large amounts of data, have recently shown a
plethora of real-world manipulation and navigation capabilities. Several independent efforts …
Humanoid locomotion and manipulation: Current progress and challenges in control, planning, and learning
Humanoid robots have great potential to perform various human-level skills. These skills
involve locomotion, manipulation, and cognitive capabilities. Driven by advances in machine …
LOTUS: Continual imitation learning for robot manipulation through unsupervised skill discovery
We introduce LOTUS, a continual imitation learning algorithm that empowers a physical
robot to continuously and efficiently learn to solve new manipulation tasks throughout its …
Deep generative models in robotics: A survey on learning from multimodal demonstrations
Learning from Demonstrations, the field that proposes to learn robot behavior models from
data, is gaining popularity with the emergence of deep generative models. Although the …
Fast: Efficient action tokenization for vision-language-action models
Autoregressive sequence models, such as Transformer-based vision-language action (VLA)
policies, can be tremendously effective for capturing complex and generalizable robotic …
DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control
Imitation learning has proven to be a powerful tool for training complex visuo-motor policies.
However, current methods often require hundreds to thousands of expert demonstrations to …