Training language models to follow instructions with human feedback
Making language models bigger does not inherently make them better at following a user's
intent. For example, large language models can generate outputs that are untruthful, toxic, or …
Embodied navigation with multi-modal information: A survey from tasks to methodology
Embodied AI aims to create agents that complete complex tasks by interacting with the
environment. A key problem in this field is embodied navigation which understands multi …
Less is more: Generating grounded navigation instructions from landmarks
We study the automatic generation of navigation instructions from 360-degree images
captured on indoor routes. Existing generators suffer from poor visual grounding, causing …
ETPNav: Evolving topological planning for vision-language navigation in continuous environments
Vision-language navigation is a task that requires an agent to follow instructions to navigate
in environments. It becomes increasingly crucial in the field of embodied AI, with potential …
Vision-and-language navigation today and tomorrow: A survey in the era of foundation models
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years
and many approaches have emerged to advance their development. The remarkable …