Aligning cyber space with physical world: A comprehensive survey on embodied AI

Y Liu, W Chen, Y Bai, X Liang, G Li, W Gao… - arxiv preprint arxiv …, 2024 - arxiv.org
… and challenging area in embodied AI. It is crucial for advancing next-generation intelligent robots …

A Survey of Embodied AI in Healthcare: Techniques, Applications, and Opportunities

Y Liu, X Cao, T Chen, Y Jiang, J You, M Wu… - arxiv preprint arxiv …, 2025 - arxiv.org
Healthcare systems worldwide face persistent challenges in efficiency, accessibility, and
personalization. Powered by modern AI technologies such as multimodal large language …

VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use

Z Zhang, R Rossi, T Yu, F Dernoncourt… - arxiv preprint arxiv …, 2024 - arxiv.org
While vision-language models (VLMs) have demonstrated remarkable performance across
various tasks combining textual and visual information, they continue to struggle with fine …

Integrating reinforcement learning with foundation models for autonomous robotics: Methods and perspectives

A Moroncelli, V Soni, AA Shahid, M Maccarini… - arxiv preprint arxiv …, 2024 - arxiv.org
Foundation models (FMs), large deep learning models pre-trained on vast, unlabeled
datasets, exhibit powerful capabilities in understanding complex patterns and generating …

Vision-language-action model and diffusion policy switching enables dexterous control of an anthropomorphic hand

C Pan, K Junge, J Hughes - arxiv preprint arxiv:2410.14022, 2024 - arxiv.org
To advance autonomous dexterous manipulation, we propose a hybrid control method that
combines the relative advantages of a fine-tuned Vision-Language-Action (VLA) model and …

Investigating the role of instruction variety and task difficulty in robotic manipulation tasks

A Parekh, N Vitsakis, A Suglia, I Konstas - arxiv preprint arxiv:2407.03967, 2024 - arxiv.org
Evaluating the generalisation capabilities of multimodal models based solely on their
performance on out-of-distribution data fails to capture their true robustness. This work …