Real-world robot applications of foundation models: A review
Recent developments in foundation models, like Large Language Models (LLMs) and Vision-
Language Models (VLMs), trained on extensive data, facilitate flexible application across …
Generating meaning: active inference and the scope and limits of passive AI
Prominent accounts of sentient behavior depict brains as generative models of organismic
interaction with the world, evincing intriguing similarities with current advances in generative …
RT-2: Vision-language-action models transfer web knowledge to robotic control
A Brohan, N Brown, J Carbajal, Y Chebotar… - arXiv preprint arXiv …, 2023 - arxiv.org
We study how vision-language models trained on Internet-scale data can be incorporated
directly into end-to-end robotic control to boost generalization and enable emergent …
Open X-Embodiment: Robotic learning datasets and RT-X models
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …
Open X-Embodiment: Robotic Learning Datasets and RT-X Models (Open X-Embodiment Collaboration)
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …
Octo: An open-source generalist robot policy
Large policies pretrained on diverse robot datasets have the potential to transform robotic
learning: instead of training new policies from scratch, such generalist robot policies may be …
LIV: Language-image representations and rewards for robotic control
We present Language-Image Value learning (LIV), a unified objective for vision-
language representation and reward learning from action-free videos with text annotations …
RoboAgent: Generalization and efficiency in robot manipulation via semantic augmentations and action chunking
The grand aim of having a single robot that can manipulate arbitrary objects in diverse
settings is at odds with the paucity of robotics datasets. Acquiring and growing such datasets …
ViNT: A foundation model for visual navigation
General-purpose pre-trained models ("foundation models") have enabled practitioners to
produce generalizable solutions for individual machine learning problems with datasets that …