Open-TeleVision: Teleoperation with immersive active visual feedback
Teleoperation serves as a powerful method for collecting on-robot data essential for robot
learning from demonstrations. The intuitiveness and ease of use of the teleoperation system …
TinyVLA: Towards fast, data-efficient vision-language-action models for robotic manipulation
Vision-Language-Action (VLA) models have shown remarkable potential in visuomotor
control and instruction comprehension through end-to-end learning processes. However …
Policy adaptation via language optimization: Decomposing tasks for few-shot imitation
Learned language-conditioned robot policies often struggle to effectively adapt to new real-
world tasks even when pre-trained across a diverse set of instructions. We propose a novel …
A survey on enhancing reinforcement learning in complex environments: Insights from human and LLM feedback
Reinforcement learning (RL) is one of the active fields in machine learning, demonstrating
remarkable potential in tackling real-world challenges. Despite its promising prospects, this …
VIEW: Visual imitation learning with waypoints
Robots can use visual imitation learning (VIL) to learn manipulation tasks from video
demonstrations. However, translating visual observations into actionable robot policies is …
FLAIR: Feeding via Long-horizon AcquIsition of Realistic dishes
Robot-assisted feeding has the potential to improve the quality of life for individuals with
mobility limitations who are unable to feed themselves independently. However, there exists …
Words2Contact: Identifying support contacts from verbal instructions using foundation models
This paper presents Words2Contact, a language-guided multi-contact placement pipeline
leveraging large language models and vision language models. Our method is a key …
VernaCopter: Disambiguated natural-language-driven robot via formal specifications
It has been an ambition of many to control a robot for a complex task using natural language
(NL). The rise of large language models (LLMs) makes it closer to coming true. However, an …
The Ingredients for Robotic Diffusion Transformers
In recent years roboticists have achieved remarkable progress in solving increasingly
general tasks on dexterous robotic hardware by leveraging high capacity Transformer …
Autonomous interactive correction MLLM for robust robotic manipulation
The ability to reflect on and correct failures is crucial for robotic systems to interact stably with
real-life objects. Observing the generalization and reasoning capabilities of Multimodal …