RoboCasa: Large-scale simulation of everyday tasks for generalist robots

S Nasiriany, A Maddukuri, L Zhang, A Parikh… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in Artificial Intelligence (AI) have largely been propelled by scaling. In
Robotics, scaling is hindered by the lack of access to massive robot datasets. We advocate …

Benchmark evaluations, applications, and challenges of large vision language models: A survey

Z Li, X Wu, H Du, H Nghiem, G Shi - arXiv preprint arXiv:2501.02189, 2025 - arxiv.org
Multimodal Vision Language Models (VLMs) have emerged as a transformative technology
at the intersection of computer vision and natural language processing, enabling machines …

Pushing the limits of cross-embodiment learning for manipulation and navigation

J Yang, C Glossop, A Bhorkar, D Shah… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent years in robotics and imitation learning have seen remarkable progress in training
large-scale foundation models by leveraging data across a multitude of embodiments. The …

The Colosseum: A benchmark for evaluating generalization for robotic manipulation

W Pumacay, I Singh, J Duan, R Krishna… - arXiv preprint arXiv …, 2024 - arxiv.org
To realize effective large-scale, real-world robotic applications, we must evaluate how well
our robot policies adapt to changes in environmental conditions. Unfortunately, a majority of …

Towards efficient LLM grounding for embodied multi-agent collaboration

Y Zhang, S Yang, C Bai, F Wu, X Li, Z Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Grounding the reasoning ability of large language models (LLMs) for embodied tasks is
challenging due to the complexity of the physical world. In particular, LLM planning for multi …

Policy adaptation via language optimization: Decomposing tasks for few-shot imitation

V Myers, BC Zheng, O Mees, S Levine… - arXiv preprint arXiv …, 2024 - arxiv.org
Learned language-conditioned robot policies often struggle to effectively adapt to new real-
world tasks even when pre-trained across a diverse set of instructions. We propose a novel …

Thinking in space: How multimodal large language models see, remember, and recall spaces

J Yang, S Yang, AW Gupta, R Han, L Fei-Fei… - arXiv preprint arXiv …, 2024 - arxiv.org
Humans possess the visual-spatial intelligence to remember spaces from sequential visual
observations. However, can Multimodal Large Language Models (MLLMs) trained on million …

CogACT: A foundational vision-language-action model for synergizing cognition and action in robotic manipulation

Q Li, Y Liang, Z Wang, L Luo, X Chen, M Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
The advancement of large Vision-Language-Action (VLA) models has significantly improved
robotic manipulation in terms of language-guided task execution and generalization to …

A survey of robotic language grounding: Tradeoffs between symbols and embeddings

V Cohen, JX Liu, R Mooney, S Tellex… - arXiv preprint arXiv …, 2024 - arxiv.org
With large language models, robots can understand language more flexibly and more
capably than ever before. This survey reviews and situates recent literature into a spectrum …

AnyCar to Anywhere: Learning universal dynamics model for agile and adaptive mobility

W Xiao, H Xue, T Tao, D Kalaria, JM Dolan… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent works in the robot learning community have successfully introduced generalist
models capable of controlling various robot embodiments across a wide range of tasks, such …