The rational speech act framework

J Degen - Annual Review of Linguistics, 2023 - annualreviews.org
The past decade has seen the rapid development of a new approach to pragmatics that
attempts to integrate insights from formal and experimental semantics and pragmatics …

The challenges and prospects of brain-based prediction of behaviour

J Wu, J Li, SB Eickhoff, D Scheinost… - Nature Human …, 2023 - nature.com
Relating individual brain patterns to behaviour is fundamental in systems neuroscience.
Recently, the predictive modelling approach has become increasingly popular, largely due …

Self-Instruct: Aligning language models with self-generated instructions

Y Wang, Y Kordi, S Mishra, A Liu, NA Smith… - arXiv preprint arXiv …, 2022 - arxiv.org
Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have
demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they …

LLM-Planner: Few-shot grounded planning for embodied agents with large language models

CH Song, J Wu, C Washington… - Proceedings of the …, 2023 - openaccess.thecvf.com
This study focuses on using large language models (LLMs) as a planner for embodied
agents that can follow natural language instructions to complete complex tasks in a visually …

Visual language maps for robot navigation

C Huang, O Mees, A Zeng… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Grounding language to the visual observations of a navigating agent can be performed
using off-the-shelf visual-language models pretrained on Internet-scale data (e.g., image …

Language models as zero-shot planners: Extracting actionable knowledge for embodied agents

W Huang, P Abbeel, D Pathak… - … Conference on Machine …, 2022 - proceedings.mlr.press
Can world knowledge learned by large language models (LLMs) be used to act in
interactive environments? In this paper, we investigate the possibility of grounding high-level …

How much can CLIP benefit vision-and-language tasks?

S Shen, LH Li, H Tan, M Bansal, A Rohrbach… - arXiv preprint arXiv …, 2021 - arxiv.org
Most existing Vision-and-Language (V&L) models rely on pre-trained visual encoders, using
a relatively small set of manually-annotated data (as compared to web-crawled data), to …

History aware multimodal transformer for vision-and-language navigation

S Chen, PL Guhur, C Schmid… - Advances in Neural …, 2021 - proceedings.neurips.cc
Vision-and-language navigation (VLN) aims to build autonomous visual agents that follow
instructions and navigate in real scenes. To remember previously visited locations and …

NavGPT: Explicit reasoning in vision-and-language navigation with large language models

G Zhou, Y Hong, Q Wu - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT
and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling. Such …

Think global, act local: Dual-scale graph transformer for vision-and-language navigation

S Chen, PL Guhur, M Tapaswi… - Proceedings of the …, 2022 - openaccess.thecvf.com
Following language instructions to navigate in unseen environments is a challenging
problem for autonomous embodied agents. The agent not only needs to ground languages …