Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - Advanced …, 2024 - Taylor & Francis
Recent developments in foundation models, like Large Language Models (LLMs) and Vision-
Language Models (VLMs), trained on extensive data, facilitate flexible application across …

Autoregressive image generation without vector quantization

T Li, Y Tian, H Li, M Deng, K He - Advances in Neural …, 2025 - proceedings.neurips.cc
Conventional wisdom holds that autoregressive models for image generation are typically
accompanied by vector-quantized tokens. We observe that while a discrete-valued space …

iVideoGPT: Interactive VideoGPTs are scalable world models

J Wu, S Yin, N Feng, X He, D Li… - Advances in Neural …, 2025 - proceedings.neurips.cc
World models empower model-based agents to interactively explore, reason, and plan
within imagined environments for real-world decision-making. However, the high demand …

3D Diffuser Actor: Policy diffusion with 3D scene representations

TW Ke, N Gkanatsios, K Fragkiadaki - arXiv preprint arXiv:2402.10885, 2024 - arxiv.org
Diffusion policies are conditional diffusion models that learn robot action distributions
conditioned on the robot and environment state. They have recently shown to outperform …

MOKA: Open-vocabulary robotic manipulation through mark-based visual prompting

F Liu, K Fang, P Abbeel, S Levine - First Workshop on Vision …, 2024 - openreview.net
Open-vocabulary generalization requires robotic systems to perform tasks involving complex
and diverse environments and task goals. While the recent advances in vision language …

Evaluating real-world robot manipulation policies in simulation

X Li, K Hsu, J Gu, K Pertsch, O Mees, HR Walke… - arXiv preprint arXiv …, 2024 - arxiv.org
The field of robotics has made significant advances towards generalist robot manipulation
policies. However, real-world evaluation of such policies is not scalable and faces …

Open Teach: A versatile teleoperation system for robotic manipulation

A Iyer, Z Peng, Y Dai, I Guzey, S Haldar… - arXiv preprint arXiv …, 2024 - arxiv.org
Open-sourced, user-friendly tools form the bedrock of scientific advancement across
disciplines. The widespread adoption of data-driven learning has led to remarkable …