LLM-Planner: Few-shot grounded planning for embodied agents with large language models

CH Song, J Wu, C Washington… - Proceedings of the …, 2023 - openaccess.thecvf.com
This study focuses on using large language models (LLMs) as a planner for embodied
agents that can follow natural language instructions to complete complex tasks in a visually …

ESC: Exploration with soft commonsense constraints for zero-shot object navigation

K Zhou, K Zheng, C Pryor, Y Shen… - International …, 2023 - proceedings.mlr.press
The ability to accurately locate and navigate to a specific object is a crucial capability for
embodied agents that operate in the real world and interact with objects to complete tasks …

VLMbench: A compositional benchmark for vision-and-language manipulation

K Zheng, X Chen, OC Jenkins… - Advances in Neural …, 2022 - proceedings.neurips.cc
Benefiting from language flexibility and compositionality, humans naturally intend to use
language to command an embodied agent for complex tasks such as navigation and object …

Vision-and-language navigation today and tomorrow: A survey in the era of foundation models

Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years,
and many approaches have emerged to advance its development. The remarkable …

Advancements and challenges in mobile robot navigation: A comprehensive review of algorithms and potential for self-learning approaches

S Al Mahmud, A Kamarulariffin, AM Ibrahim… - Journal of Intelligent & …, 2024 - Springer
Mobile robot navigation has been a very popular research topic for quite some time. With
the goal of enhancing autonomy in mobile robot navigation, numerous …

Scene-LLM: Extending language model for 3D visual understanding and reasoning

R Fu, J Liu, X Chen, Y Nie, W Xiong - arXiv preprint arXiv:2403.11401, 2024 - arxiv.org
This paper introduces Scene-LLM, a 3D-visual-language model that enhances embodied
agents' abilities in interactive 3D indoor environments by integrating the reasoning strengths …

Open-ended instructable embodied agents with memory-augmented large language models

G Sarch, Y Wu, MJ Tarr, K Fragkiadaki - arXiv preprint arXiv:2310.15127, 2023 - arxiv.org
Pre-trained and frozen LLMs can effectively map simple scene re-arrangement instructions
to programs over a robot's visuomotor functions through appropriate few-shot example …

Plan, posture and go: Towards open-world text-to-motion generation

J Liu, W Dai, C Wang, Y Cheng, Y Tang… - arXiv preprint arXiv …, 2023 - arxiv.org
Conventional text-to-motion generation methods are usually trained on limited text-motion
pairs, making them hard to generalize to open-world scenarios. Some works use the CLIP …

Plan, posture and go: Towards open-vocabulary text-to-motion generation

J Liu, W Dai, C Wang, Y Cheng, Y Tang… - European Conference on …, 2024 - Springer
Conventional text-to-motion generation methods are usually trained on limited text-motion
pairs, making them hard to generalize to open-vocabulary scenarios. Some works use the …

To boost zero-shot generalization for embodied reasoning with vision-language pre-training

K Su, X Zhang, S Zhang, J Zhu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recently, there has been increasing research interest in embodied artificial intelligence (EAI),
which involves an agent learning to perform a specific task while dynamically interacting …