- Academic Search

Teach: Task-driven embodied agents that chat

A Padmakumar, J Thomason, A Shrivastava… - Proceedings of the …, 2022 - ojs.aaai.org

Robots operating in human spaces must be able to engage in natural language interaction,
both understanding and executing instructions, and using conversation to resolve ambiguity …

Cited by 177 - All 10 versions

[PDF] jair.org

Core challenges in embodied vision-language planning

J Francis, N Kitamura, F Labelle, X Lu, I Navarro… - Journal of Artificial …, 2022 - jair.org

Recent advances in the areas of multimodal machine learning and artificial intelligence (AI)
have led to the development of challenging tasks at the intersection of Computer Vision …

Cited by 51 - All 14 versions

[PDF] arxiv.org

SpartQA: A textual question answering benchmark for spatial reasoning

R Mirzaee, HR Faghihi, Q Ning… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural
language text which contains more realistic spatial phenomena not covered by prior work …

Cited by 82 - All 7 versions

[PDF] arxiv.org

Vision-language navigation: a survey and taxonomy

W Wu, T Chang, X Li, Q Yin, Y Hu - Neural Computing and Applications, 2024 - Springer

Vision-language navigation (VLN) tasks require an agent to follow language instructions
from a human guide to navigate in previously unseen environments using visual …

Cited by 22 - All 4 versions

[PDF] arxiv.org

Grounding open-domain instructions to automate web support tasks

N Xu, S Masling, M Du, G Campagna, L Heck… - arXiv preprint arXiv …, 2021 - arxiv.org

Grounding natural language instructions on the web to perform previously unseen tasks
enables accessibility and automation. We introduce a task and dataset to train AI agents …

Cited by 41 - All 4 versions

[PDF] aclanthology.org

A meta-framework for spatiotemporal quantity extraction from text

Q Ning, B Zhou, H Wu, H Peng, C Fan… - Proceedings of the …, 2022 - aclanthology.org

News events are often associated with quantities (e.g., the number of COVID-19 patients or
the number of arrests in a protest), and it is often important to extract their type, time, and …

Cited by 13 - All 3 versions

[PDF] arxiv.org

Unifying structure reasoning and language model pre-training for complex reasoning

S Wang, Z Wei, J Xu, T Li, Z Fan - arXiv preprint arXiv:2301.08913, 2023 - arxiv.org

Recent pre-trained language models (PLMs) equipped with foundation reasoning skills
have shown remarkable performance on downstream complex tasks. However, the …

Cited by 11 - All 2 versions

[PDF] github.io

Unifying Structure Reasoning and Language Pre-Training for Complex Reasoning Tasks

S Wang, Z Wei, J Xu, T Li, Z Fan - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org

Recent pre-trained language models (PLMs) equipped with foundation reasoning skills
have shown remarkable performance on downstream complex tasks. However, the …

Cited by 4 - All 3 versions

[PDF] arxiv.org

Into the Unknown: Generating Geospatial Descriptions for New Environments

T Paz-Argaman, J Palowitch, S Kulkarni… - arXiv preprint arXiv …, 2024 - arxiv.org

Similar to vision-and-language navigation (VLN) tasks that focus on bridging the gap
between vision and language for embodied navigation, the new Rendezvous (RVS) task …

Cited by 1 - All 5 versions

[PDF] arxiv.org

tagE: Enabling an Embodied Agent to Understand Human Instructions

C Sarkar, A Mitra, P Pramanick, T Nayak - arXiv preprint arXiv:2310.15605, 2023 - arxiv.org

Natural language serves as the primary mode of communication when an intelligent agent
with a physical presence engages with human beings. While a plethora of research focuses …

All 4 versions
