Embodied navigation with multi-modal information: A survey from tasks to methodology
Embodied AI aims to create agents that complete complex tasks by interacting with the
environment. A key problem in this field is embodied navigation which understands multi …
NavGPT: Explicit reasoning in vision-and-language navigation with large language models
Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT
and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling. Such …
Scaling data generation in vision-and-language navigation
Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …
GridMM: Grid memory map for vision-and-language navigation
Vision-and-language navigation (VLN) enables the agent to navigate to a remote location
following the natural language instruction in 3D environments. To represent the previously …
DreamWalker: Mental planning for continuous vision-language navigation
VLN-CE is a recently released embodied task, where AI agents need to navigate a freely
traversable environment to reach a distant target location, given language instructions. It …
March in Chat: Interactive prompting for remote embodied referring expression
Many Vision-and-Language Navigation (VLN) tasks have been proposed in recent
years, from room-based to object-based and indoor to outdoor. The REVERIE (Remote …
NavGPT-2: Unleashing navigational reasoning capability for large vision-language models
Capitalizing on the remarkable advancements in Large Language Models (LLMs), there is a
burgeoning initiative to harness LLMs for instruction following robotic navigation. Such a …
BEVBert: Multimodal map pre-training for language-guided navigation
Large-scale pre-training has shown promising results on the vision-and-language
navigation (VLN) task. However, most existing pre-training methods employ discrete …
Vision-and-language navigation today and tomorrow: A survey in the era of foundation models
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years
and many approaches have emerged to advance their development. The remarkable …
ETPNav: Evolving topological planning for vision-language navigation in continuous environments
Vision-language navigation is a task that requires an agent to follow instructions to navigate
in environments. It becomes increasingly crucial in the field of embodied AI, with potential …