Google Scholar

Embodied navigation with multi-modal information: A survey from tasks to methodology

Y Wu, P Zhang, M Gu, J Zheng, X Bai - Information Fusion, 2024 - Elsevier

Embodied AI aims to create agents that complete complex tasks by interacting with the
environment. A key problem in this field is embodied navigation which understands multi …

Cited by 5

[PDF] aaai.org

Navgpt: Explicit reasoning in vision-and-language navigation with large language models

G Zhou, Y Hong, Q Wu - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT
and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling. Such …

Cited by 121

[PDF] thecvf.com

Scaling data generation in vision-and-language navigation

Z Wang, J Li, Y Hong, Y Wang, Q Wu… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …

Cited by 60

[PDF] thecvf.com

Gridmm: Grid memory map for vision-and-language navigation

Z Wang, X Li, J Yang, Y Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Vision-and-language navigation (VLN) enables the agent to navigate to a remote location
following the natural language instruction in 3D environments. To represent the previously …

Cited by 48

[PDF] thecvf.com

Dreamwalker: Mental planning for continuous vision-language navigation

H Wang, W Liang, L Van Gool… - Proceedings of the …, 2023 - openaccess.thecvf.com

VLN-CE is a recently released embodied task, where AI agents need to navigate a freely
traversable environment to reach a distant target location, given language instructions. It …

Cited by 34

[PDF] thecvf.com

March in chat: Interactive prompting for remote embodied referring expression

Y Qiao, Y Qi, Z Yu, J Liu, Q Wu - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Many Vision-and-Language Navigation (VLN) tasks have been proposed in recent
years, from room-based to object-based and indoor to outdoor. The REVERIE (Remote …

Cited by 33

[PDF] arxiv.org

Navgpt-2: Unleashing navigational reasoning capability for large vision-language models

G Zhou, Y Hong, Z Wang, XE Wang, Q Wu - European Conference on …, 2024 - Springer

Capitalizing on the remarkable advancements in Large Language Models (LLMs), there is a
burgeoning initiative to harness LLMs for instruction following robotic navigation. Such a …

Cited by 14

[PDF] thecvf.com

Bevbert: Multimodal map pre-training for language-guided navigation

D An, Y Qi, Y Li, Y Huang, L Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale pre-training has shown promising results on the vision-and-language
navigation (VLN) task. However, most existing pre-training methods employ discrete …

Cited by 48

[PDF] arxiv.org

Vision-and-language navigation today and tomorrow: A survey in the era of foundation models

Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai… - arxiv preprint arxiv …, 2024 - arxiv.org

Vision-and-Language Navigation (VLN) has gained increasing attention over recent years
and many approaches have emerged to advance their development. The remarkable …

Cited by 13

[PDF] arxiv.org

Etpnav: Evolving topological planning for vision-language navigation in continuous environments

D An, H Wang, W Wang, Z Wang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Vision-language navigation is a task that requires an agent to follow instructions to navigate
in environments. It becomes increasingly crucial in the field of embodied AI, with potential …

Cited by 51
