Multimodal research in vision and language: A review of current and emerging trends

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier
Deep Learning and its applications have catalyzed impactful research and development across the diverse range of modalities present in real-world data. More recently, this has …

NavGPT: Explicit reasoning in vision-and-language navigation with large language models

G Zhou, Y Hong, Q Wu - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT
and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling. Such …

Embodied navigation with multi-modal information: A survey from tasks to methodology

Y Wu, P Zhang, M Gu, J Zheng, X Bai - Information Fusion, 2024 - Elsevier
Embodied AI aims to create agents that complete complex tasks by interacting with the environment. A key problem in this field is embodied navigation, which understands multi …

PanoGen: Text-conditioned panoramic environment generation for vision-and-language navigation

J Li, M Bansal - Advances in Neural Information Processing …, 2023 - proceedings.neurips.cc
Vision-and-Language Navigation requires the agent to follow language instructions to navigate through 3D environments. One main challenge in Vision-and-Language …

VLN BERT: A recurrent vision-and-language BERT for navigation

Y Hong, Q Wu, Y Qi… - Proceedings of the …, 2021 - openaccess.thecvf.com
The accuracy of many visiolinguistic tasks has benefited significantly from the application of vision-and-language (V&L) BERT. However, its application to the task of vision-and …

Room-across-room: Multilingual vision-and-language navigation with dense spatiotemporal grounding

A Ku, P Anderson, R Patel, E Ie, J Baldridge - arXiv preprint arXiv …, 2020 - arxiv.org
We introduce Room-Across-Room (RxR), a new Vision-and-Language Navigation (VLN)
dataset. RxR is multilingual (English, Hindi, and Telugu) and larger (more paths and …

Towards learning a generalist model for embodied navigation

D Zheng, S Huang, L Zhao… - Proceedings of the …, 2024 - openaccess.thecvf.com
Building a generalist agent that can interact with the world is an ultimate goal for humans, spurring research on embodied navigation, where an agent is required to navigate …

Vision-and-language navigation: A survey of tasks, methods, and future directions

J Gu, E Stefani, Q Wu, J Thomason… - arXiv preprint arXiv …, 2022 - arxiv.org
A long-term goal of AI research is to build intelligent agents that can communicate with
humans in natural language, perceive the environment, and perform real-world tasks. Vision …

Airbert: In-domain pretraining for vision-and-language navigation

PL Guhur, M Tapaswi, S Chen… - Proceedings of the …, 2021 - openaccess.thecvf.com
Vision-and-language navigation (VLN) aims to enable embodied agents to navigate in
realistic environments using natural language instructions. Given the scarcity of domain …

March in Chat: Interactive prompting for remote embodied referring expression

Y Qiao, Y Qi, Z Yu, J Liu, Q Wu - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Many Vision-and-Language Navigation (VLN) tasks have been proposed in recent years, from room-based to object-based and indoor to outdoor. The REVERIE (Remote …