SPOT: SE(3) Pose Trajectory Diffusion for Object-Centric Manipulation
We introduce SPOT, an object-centric imitation learning framework. The key idea is to
capture each task by an object-centric representation, specifically the SE(3) object pose …
One-shot imitation under mismatched execution
Human demonstrations as prompts are a powerful way to program robots to do long-horizon
manipulation tasks. However, translating these demonstrations into robot-executable actions …
Learning from Massive Human Videos for Universal Humanoid Pose Control
Scalable learning of humanoid robots is crucial for their deployment in real-world
applications. While traditional approaches primarily rely on reinforcement learning or …
Motion Before Action: Diffusing Object Motion as Manipulation Condition
Inferring object motion representations from observations enhances the performance of
robotic manipulation tasks. This paper introduces a new paradigm for robot imitation …
ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions
The goal of this work is to generate step-by-step visual instructions in the form of a sequence
of images, given an input image that provides the scene context and the sequence of textual …
Motion Tracks: A Unified Representation for Human-Robot Transfer in Few-Shot Imitation Learning
Teaching robots to autonomously complete everyday tasks remains a challenge. Imitation
Learning (IL) is a powerful approach that imbues robots with skills via demonstrations, but is …
Zero-Shot Monocular Scene Flow Estimation in the Wild
Large models have shown generalization across datasets for many low-level vision tasks,
like depth estimation, but no such general models exist for scene flow. Even though scene …
FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks
We aim to develop a model-based planning framework for world models that can be scaled
with increasing model and data budgets for general-purpose manipulation tasks with only …
RoboPanoptes: The All-seeing Robot with Whole-body Dexterity
We present RoboPanoptes, a capable yet practical robot system that achieves whole-body
dexterity through whole-body vision. Its whole-body dexterity allows the robot to utilize its …
Embodiment-Agnostic Action Planning via Object-Part Scene Flow
Observing that the key for robotic action planning is to understand the target-object motion
when its associated part is manipulated by the end effector, we propose to generate the 3D …