Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Imperative learning: A self-supervised neural-symbolic learning framework for robot autonomy

C Wang, K Ji, J Geng, Z Ren, T Fu, F Yang… - ar…

J Sun, P Mao, L Kong, J Wang - Sensors (Basel, Switzerland), 2025 - pmc.ncbi.nlm.nih.gov
Pre-trained models trained with internet-scale data have achieved significant improvements
in perception, interaction, and reasoning. Using them as the basis of embodied grasping …

SAT: Spatial Aptitude Training for Multimodal Language Models

A Ray, J Duan, R Tan, D Bashkirova, R Hendrix… - arxiv preprint arxiv …, 2024 - arxiv.org
Spatial perception is a fundamental component of intelligence. While many studies highlight
that large multimodal language models (MLMs) struggle to reason about space, they only …

BiFold: Bimanual Cloth Folding with Language Guidance

O Barbany, A Colomé, C Torras - arxiv preprint arxiv:2501.16458, 2025 - arxiv.org
Cloth folding is a complex task due to the inevitable self-occlusions of clothes, their
complicated dynamics, and the disparate materials, geometries, and textures that garments …

The One RING: a Robotic Indoor Navigation Generalist

A Eftekhar, L Weihs, R Hendrix… - arxiv preprint arxiv …, 2024 - one-ring-policy.allen.ai
Modern robots vary significantly in shape, size, and sensor configurations used to perceive
and interact with their environments. However, most navigation policies are embodiment …