Do embodied agents dream of pixelated sheep: Embodied decision making using language guided world modelling

K Nottingham, P Ammanabrolu, A Suhr… - International …, 2023 - proceedings.mlr.press
Reinforcement learning (RL) agents typically learn tabula rasa, without prior knowledge of
the world. However, if initialized with knowledge of high-level subgoals and transitions …

CALVIN: A benchmark for language-conditioned policy learning for long-horizon robot manipulation tasks

O Mees, L Hermann, E Rosete-Beas… - IEEE Robotics and …, 2022 - ieeexplore.ieee.org
General-purpose robots coexisting with humans in their environment must learn to relate
human language to their perceptions and actions to be useful in a range of daily tasks …

Language conditioned imitation learning over unstructured data

C Lynch, P Sermanet - arXiv preprint arXiv:2005.07648, 2020 - arxiv.org
Natural language is perhaps the most flexible and intuitive way for humans to communicate
tasks to a robot. Prior work in imitation learning typically requires each task be specified with …

A survey of reinforcement learning informed by natural language

J Luketina, N Nardelli, G Farquhar, J Foerster… - arXiv preprint arXiv …, 2019 - arxiv.org
To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the
compositional, relational, and hierarchical structure of the world, and learn to transfer it to the …

Visual semantic navigation using scene priors

W Yang, X Wang, A Farhadi, A Gupta… - arXiv preprint arXiv …, 2018 - arxiv.org
How do humans navigate to target objects in novel scenes? Do we use the
semantic/functional priors we have built over years to efficiently search and navigate? For …

Learning to learn how to learn: Self-adaptive visual navigation using meta-learning

M Wortsman, K Ehsani, M Rastegari… - Proceedings of the …, 2019 - openaccess.thecvf.com
Learning is an inherently continuous phenomenon. When humans learn a new task there is
no explicit distinction between training and inference. As we learn a task, we keep learning …

BabyAI: A platform to study the sample efficiency of grounded language learning

M Chevalier-Boisvert, D Bahdanau, S Lahlou… - arXiv preprint arXiv …, 2018 - arxiv.org
Allowing humans to interactively train artificial agents to understand language instructions is
desirable for both practical and scientific reasons, but given the poor data efficiency of the …

Look, listen, and act: Towards audio-visual embodied navigation

C Gan, Y Zhang, J Wu, B Gong… - … on Robotics and …, 2020 - ieeexplore.ieee.org
A crucial ability of mobile intelligent agents is to integrate the evidence from multiple sensory
inputs in an environment and to make a sequence of actions to reach their goals. In this …

Vision-language navigation: a survey and taxonomy

W Wu, T Chang, X Li, Q Yin, Y Hu - Neural Computing and Applications, 2024 - Springer
Vision-language navigation (VLN) tasks require an agent to follow language instructions
from a human guide to navigate in previously unseen environments using visual …

Learning to understand goal specifications by modelling reward

D Bahdanau, F Hill, J Leike, E Hughes… - arXiv preprint arXiv …, 2018 - arxiv.org
Recent work has shown that deep reinforcement-learning agents can learn to follow
language-like instructions from infrequent environment rewards. However, this places on …