The rise and potential of large language model based agents: A survey

Z **, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Tool learning with foundation models

Y Qin, S Hu, Y Lin, W Chen, N Ding, G Cui… - ACM Computing …, 2024 - dl.acm.org
Humans possess an extraordinary ability to create and utilize tools. With the advent of
foundation models, artificial intelligence systems have the potential to be equally adept in …

R3m: A universal visual representation for robot manipulation

S Nair, A Rajeswaran, V Kumar, C Finn… - arxiv preprint arxiv …, 2022 - arxiv.org
We study how visual representations pre-trained on diverse human video data can enable
data-efficient learning of downstream robotic manipulation tasks. Concretely, we pre-train a …

Video pretraining (vpt): Learning to act by watching unlabeled online videos

B Baker, I Akkaya, P Zhokov… - Advances in …, 2022 - proceedings.neurips.cc
Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for
training models with broad, general capabilities for text, images, and other modalities …

Open X-Embodiment: Robotic Learning Datasets and RT-X Models : Open X-Embodiment Collaboration0

A O'Neill, A Rehman, A Maddukuri… - … on Robotics and …, 2024 - ieeexplore.ieee.org
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arxiv preprint arxiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (eg, BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Mimicplay: Long-horizon imitation learning by watching human play

C Wang, L Fan, J Sun, R Zhang, L Fei-Fei, D Xu… - arxiv preprint arxiv …, 2023 - arxiv.org
Imitation learning from human demonstrations is a promising paradigm for teaching robots
manipulation skills in the real world. However, learning complex long-horizon tasks often …

A survey of imitation learning: Algorithms, recent developments, and challenges

M Zare, PM Kebria, A Khosravi… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
In recent years, the development of robotics and artificial intelligence (AI) systems has been
nothing short of remarkable. As these systems continue to evolve, they are being utilized in …

Human-to-robot imitation in the wild

S Bahl, A Gupta, D Pathak - arxiv preprint arxiv:2207.09450, 2022 - arxiv.org
We approach the problem of learning by watching humans in the wild. While traditional
approaches in Imitation and Reinforcement Learning are promising for learning in the real …

Humanplus: Humanoid shadowing and imitation from humans

Z Fu, Q Zhao, Q Wu, G Wetzstein, C Finn - arxiv preprint arxiv:2406.10454, 2024 - arxiv.org
One of the key arguments for building robots that have similar form factors to human beings
is that we can leverage the massive human data for training. Yet, doing so has remained …