Embodied navigation with multi-modal information: A survey from tasks to methodology
Embodied AI aims to create agents that complete complex tasks by interacting with the
environment. A key problem in this field is embodied navigation which understands multi …
On Transforming Reinforcement Learning With Transformers: The Development Trajectory
Transformers, originally devised for natural language processing (NLP), have also produced
significant successes in computer vision (CV). Due to their strong expression power …
MapGPT: Map-guided prompting with adaptive path planning for vision-and-language navigation
Embodied agents equipped with GPT as their brain have exhibited extraordinary decision-
making and generalization abilities across various tasks. However, existing zero-shot agents …
ETPNav: Evolving topological planning for vision-language navigation in continuous environments
Vision-language navigation is a task that requires an agent to follow instructions to navigate
in environments. It becomes increasingly crucial in the field of embodied AI, with potential …
Vision-and-language navigation today and tomorrow: A survey in the era of foundation models
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years
and many approaches have emerged to advance their development. The remarkable …
Navigation instruction generation with BEV perception and large language models
Navigation instruction generation, which requires embodied agents to describe the
navigation routes, has been of great interest in robotics and human-computer interaction …
Controllable navigation instruction generation with chain of thought prompting
Instruction generation is a vital and multidisciplinary research area with broad applications.
Existing instruction generation models are limited to generating instructions in a single style …
Frequency-enhanced data augmentation for vision-and-language navigation
Vision-and-Language Navigation (VLN) is a challenging task that requires an agent
to navigate through complex environments based on natural language instructions. In …
LLM as Copilot for Coarse-Grained Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) involves guiding an agent through indoor
environments using human-provided textual instructions. Coarse-grained VLN, with short …
Towards learning a generalist model for embodied navigation
Building a generalist agent that can interact with the world is an ultimate goal for humans
thus spurring the research for embodied navigation where an agent is required to navigate …