Interactive natural language processing

Z Wang, G Zhang, K Yang, N Shi, W Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within
the field of NLP, aimed at addressing limitations in existing frameworks while aligning with …

Vision-and-language navigation today and tomorrow: A survey in the era of foundation models

Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years,
and many approaches have emerged to advance its development. The remarkable …

Open-ended instructable embodied agents with memory-augmented large language models

G Sarch, Y Wu, MJ Tarr, K Fragkiadaki - arXiv preprint arXiv:2310.15127, 2023 - arxiv.org
Pre-trained and frozen LLMs can effectively map simple scene re-arrangement instructions
to programs over a robot's visuomotor functions through appropriate few-shot example …

Graph Learning for Numeric Planning

DZ Chen, S Thiébaux - arXiv preprint arXiv:2410.24080, 2024 - arxiv.org
Graph learning is naturally well suited for use in symbolic, object-centric planning due to its
ability to exploit relational structures exhibited in planning domains and to take as input …

Egocentric planning for scalable embodied task achievement

X Liu, H Palacios, C Muise - Advances in Neural …, 2024 - proceedings.neurips.cc
Embodied agents face significant challenges when tasked with performing actions in diverse
environments, particularly in generalizing across object types and executing suitable actions …

Human–robot dialogue annotation for multi-modal common ground

C Bonial, SM Lukin, M Abrams, A Baker… - Language Resources …, 2024 - Springer
In this paper, we describe the development of symbolic representations annotated on
human–robot dialogue data to make dimensions of meaning accessible to autonomous …

MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making

D Fu, B Qi, Y Gao, C Jiang, G Dong, B Zhou - arXiv preprint arXiv …, 2024 - arxiv.org
Long-term memory is significant for agents, and insights play a crucial role in it. However,
irrelevant insights and a lack of general insights can greatly undermine the …

VLM agents generate their own memories: Distilling experience into embodied programs of thought

GH Sarch, L Jang, MJ Tarr, WW Cohen… - The Thirty-eighth …, 2024 - openreview.net
Large-scale generative language and vision-language models (LLMs and VLMs) excel in
few-shot in-context learning for decision making and instruction following. However, they …

VLM agents generate their own memories: Distilling experience into embodied programs

G Sarch, L Jang, MJ Tarr, WW Cohen, K Marino… - arXiv preprint arXiv …, 2024 - arxiv.org
Large-scale generative language and vision-language models excel in in-context learning
for decision making. However, they require high-quality exemplar demonstrations to be …

HELPER-X: A Unified Instructable Embodied Agent to Tackle Four Interactive Vision-Language Domains with Memory-Augmented Language Models

G Sarch, S Somani, R Kapoor, MJ Tarr… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent research on instructable agents has used memory-augmented Large Language
Models (LLMs) as task planners, a technique that retrieves language-program examples …