Aligning cyber space with physical world: A comprehensive survey on embodied AI
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General
Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace …
Embodied navigation with multi-modal information: A survey from tasks to methodology
Embodied AI aims to create agents that complete complex tasks by interacting with the
environment. A key problem in this field is embodied navigation which understands multi …
Scaling data generation in vision-and-language navigation
Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …
EDA: Explicit text-decoupling and dense alignment for 3D visual grounding
3D visual grounding aims to find the object within point clouds mentioned by free-
form natural language descriptions with rich semantic cues. However, existing methods …
Bird's-Eye-View Scene Graph for Vision-Language Navigation
Vision-language navigation (VLN), which entails an agent to navigate 3D
environments following human instructions, has shown great advances. However, current …
GridMM: Grid memory map for vision-and-language navigation
Vision-and-language navigation (VLN) enables the agent to navigate to a remote location
following the natural language instruction in 3D environments. To represent the previously …
DreamWalker: Mental planning for continuous vision-language navigation
VLN-CE is a recently released embodied task, where AI agents need to navigate a freely
traversable environment to reach a distant target location, given language instructions. It …
March in chat: Interactive prompting for remote embodied referring expression
Many Vision-and-Language Navigation (VLN) tasks have been proposed in recent
years, from room-based to object-based and indoor to outdoor. The REVERIE (Remote …
PanoGen: Text-conditioned panoramic environment generation for vision-and-language navigation
Vision-and-Language Navigation requires the agent to follow language instructions
to navigate through 3D environments. One main challenge in Vision-and-Language …
Adaptive zone-aware hierarchical planner for vision-language navigation
The task of Vision-Language Navigation (VLN) is for an embodied agent to reach
the global goal according to the instruction. Essentially, during navigation, a series of sub …