Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - Advanced …, 2024 - Taylor & Francis
Recent developments in foundation models, like Large Language Models (LLMs) and
Vision-Language Models (VLMs), trained on extensive data, facilitate flexible application across …

Aligning cyber space with physical world: A comprehensive survey on embodied AI

Y Liu, W Chen, Y Bai, X Liang, G Li, W Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General
Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace …

Manipulate-Anything: Automating real-world robots using vision-language models

J Duan, W Yuan, W Pumacay, YR Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large-scale endeavors like and widespread community efforts such as Open-X-Embodiment
have contributed to growing the scale of robot demonstration data. However, there is still an …

Real-time anomaly detection and reactive planning with large language models

R Sinha, A Elhafsi, C Agia, M Foutter… - arXiv preprint arXiv …, 2024 - arxiv.org
Foundation models, e.g., large language models (LLMs), trained on internet-scale data
possess zero-shot generalization capabilities that make them a promising technology …

VoicePilot: Harnessing LLMs as speech interfaces for physically assistive robots

A Padmanabha, J Yuan, J Gupta… - Proceedings of the 37th …, 2024 - dl.acm.org
Physically assistive robots present an opportunity to significantly increase the well-being
and independence of individuals with motor impairments or other forms of disability who are …

EarthGPT: A universal multi-modal large language model for multi-sensor image comprehension in remote sensing domain

W Zhang, M Cai, T Zhang, Y Zhuang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Multimodal large language models (MLLMs) have demonstrated remarkable success in
vision and visual-language tasks within the natural image domain. Owing to the significant …

A survey of robot intelligence with large language models

H Jeong, H Lee, C Kim, S Shin - Applied Sciences, 2024 - mdpi.com
Since the emergence of ChatGPT, research on large language models (LLMs) has actively
progressed across various fields. LLMs, pre-trained on vast text datasets, have exhibited …

GR-MG: Leveraging Partially-Annotated Data Via Multi-Modal Goal-Conditioned Policy

P Li, H Wu, Y Huang, C Cheang… - IEEE Robotics and …, 2025 - ieeexplore.ieee.org
The robotics community has consistently aimed to achieve generalizable robot manipulation
with flexible natural language instructions. One primary challenge is that obtaining robot …

Towards a science exocortex

KG Yager - Digital Discovery, 2024 - pubs.rsc.org
Artificial intelligence (AI) methods are poised to revolutionize intellectual work, with
generative AI enabling automation of text analysis, text generation, and simple decision …

Unlocking Robotic Autonomy: A Survey on the Applications of Foundation Models

DS Jang, DH Cho, WC Lee, SK Ryu, B Jeong… - International Journal of …, 2024 - Springer
The advancement of foundation models, such as large language models (LLMs),
vision-language models (VLMs), diffusion models, and robotics foundation models (RFMs), has …