Cited by: 47

Bird's-Eye-View Scene Graph for Vision-Language Navigation

R Liu, X Wang, W Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Vision-language navigation (VLN), which entails an agent to navigate 3D
environments following human instructions, has shown great advances. However, current …

Cited by: 34

DreamWalker: Mental planning for continuous vision-language navigation

H Wang, W Liang, L Van Gool… - Proceedings of the …, 2023 - openaccess.thecvf.com

VLN-CE is a recently released embodied task, where AI agents need to navigate a freely
traversable environment to reach a distant target location, given language instructions. It …

Cited by: 87

Local-global context aware transformer for language-guided video segmentation

C Liang, W Wang, T Zhou, J Miao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

We explore the task of language-guided video segmentation (LVS). Previous algorithms
mostly adopt 3D CNNs to learn video representation, struggling to capture long-term context …

Cited by: 48

BEVBert: Multimodal map pre-training for language-guided navigation

D An, Y Qi, Y Li, Y Huang, L Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale pre-training has shown promising results on the vision-and-language
navigation (VLN) task. However, most existing pre-training methods employ discrete …

Cited by: 50

ETPNav: Evolving topological planning for vision-language navigation in continuous environments

D An, H Wang, W Wang, Z Wang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Vision-language navigation is a task that requires an agent to follow instructions to navigate
in environments. It becomes increasingly crucial in the field of embodied AI, with potential …

Cited by: 13

Vision-and-language navigation today and tomorrow: A survey in the era of foundation models

Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai… - arXiv preprint arXiv …, 2024 - arxiv.org

Vision-and-Language Navigation (VLN) has gained increasing attention over recent years
and many approaches have emerged to advance their development. The remarkable …

Cited by: 5

Navigation instruction generation with BEV perception and large language models

S Fan, R Liu, W Wang, Y Yang - European Conference on Computer …, 2024 - Springer

Navigation instruction generation, which requires embodied agents to describe the
navigation routes, has been of great interest in robotics and human-computer interaction …

Cited by: 15

Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly

H Du, S Zhang, B Xie, G Nan, J Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Video anomaly understanding (VAU) aims to automatically comprehend unusual
occurrences in videos thereby enabling various applications such as traffic surveillance and …