Multimodal research in vision and language: A review of current and emerging trends
S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier
Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …
How much can clip benefit vision-and-language tasks?
S Shen, LH Li, H Tan, M Bansal, A Rohrbach… - arXiv preprint arXiv …, 2021 - arxiv.org
Most existing Vision-and-Language (V&L) models rely on pre-trained visual encoders, using
a relatively small set of manually-annotated data (as compared to web-crawled data), to …
Embodied navigation with multi-modal information: A survey from tasks to methodology
Embodied AI aims to create agents that complete complex tasks by interacting with the
environment. A key problem in this field is embodied navigation which understands multi …
Episodic transformer for vision-and-language navigation
A Pashevich, C Schmid, C Sun - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Interaction and navigation defined by natural language instructions in dynamic
environments pose significant challenges for neural agents. This paper focuses on …
Room-across-room: Multilingual vision-and-language navigation with dense spatiotemporal grounding
A Ku, P Anderson, R Patel, E Ie, J Baldridge - arXiv preprint arXiv …, 2020 - arxiv.org
We introduce Room-Across-Room (RxR), a new Vision-and-Language Navigation (VLN)
dataset. RxR is multilingual (English, Hindi, and Telugu) and larger (more paths and …
Airbert: In-domain pretraining for vision-and-language navigation
Vision-and-language navigation (VLN) aims to enable embodied agents to navigate in
realistic environments using natural language instructions. Given the scarcity of domain …
Vision-and-language navigation: A survey of tasks, methods, and future directions
J Gu, E Stefani, Q Wu, J Thomason… - arXiv preprint arXiv …, 2022 - arxiv.org
A long-term goal of AI research is to build intelligent agents that can communicate with
humans in natural language, perceive the environment, and perform real-world tasks. Vision …
Bird's-Eye-View Scene Graph for Vision-Language Navigation
Vision-language navigation (VLN), which requires an agent to navigate 3D
environments following human instructions, has shown great advances. However, current …
Vision-language navigation with self-supervised auxiliary reasoning tasks
Vision-Language Navigation (VLN) is a task where an agent learns to navigate
following a natural language instruction. The key to this task is to perceive both the visual …
Envedit: Environment editing for vision-and-language navigation
In Vision-and-Language Navigation (VLN), an agent needs to navigate through the
environment based on natural language instructions. Due to limited available data for agent …