Machine learning for micro-and nanorobots

L Yang, J Jiang, F Ji, Y Li, KL Yung, A Ferreira… - Nature Machine …, 2024 - nature.com
Abstract Machine learning (ML) has revolutionized robotics by enhancing perception,
adaptability, decision-making and more, enabling robots to work in complex scenarios …

Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - Advanced …, 2024 - Taylor & Francis
Recent developments in foundation models, like Large Language Models (LLMs) and Vision-
Language Models (VLMs), trained on extensive data, facilitate flexible application across …

Open x-embodiment: Robotic learning datasets and rt-x models

A O'Neill, A Rehman, A Gupta, A Maddukuri… - arxiv preprint arxiv …, 2023 - arxiv.org
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …

[HTML][HTML] Rt-2: Vision-language-action models transfer web knowledge to robotic control

B Zitkovich, T Yu, S Xu, P Xu, T **ao… - … on Robot Learning, 2023 - proceedings.mlr.press
We study how vision-language models trained on Internet-scale data can be incorporated
directly into end-to-end robotic control to boost generalization and enable emergent …

Open X-Embodiment: Robotic Learning Datasets and RT-X Models : Open X-Embodiment Collaboration0

A O'Neill, A Rehman, A Maddukuri… - … on Robotics and …, 2024 - ieeexplore.ieee.org
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Photorealistic video generation with diffusion models

A Gupta, L Yu, K Sohn, X Gu, M Hahn, FF Li… - … on Computer Vision, 2024 - Springer
We present WALT, a diffusion transformer for photorealistic video generation from text
prompts. Our approach has two key design decisions. First, we use a causal encoder to …

Octo: An open-source generalist robot policy

OM Team, D Ghosh, H Walke, K Pertsch… - arxiv preprint arxiv …, 2024 - arxiv.org
Large policies pretrained on diverse robot datasets have the potential to transform robotic
learning: instead of training new policies from scratch, such generalist robot policies may be …

Bridgedata v2: A dataset for robot learning at scale

HR Walke, K Black, TZ Zhao, Q Vuong… - … on Robot Learning, 2023 - proceedings.mlr.press
We introduce BridgeData V2, a large and diverse dataset of robotic manipulation behaviors
designed to facilitate research in scalable robot learning. BridgeData V2 contains 53,896 …

Roboagent: Generalization and efficiency in robot manipulation via semantic augmentations and action chunking

H Bharadhwaj, J Vakil, M Sharma… - … on Robotics and …, 2024 - ieeexplore.ieee.org
The grand aim of having a single robot that can manipulate arbitrary objects in diverse
settings is at odds with the paucity of robotics datasets. Acquiring and growing such datasets …