A survey of embodied AI: From simulators to research tasks

J Duan, S Yu, HL Tan, H Zhu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
There has been an emerging paradigm shift from the era of “internet AI” to “embodied AI,”
where AI algorithms and agents no longer learn from datasets of images, videos or text …

Where are we in the search for an artificial visual cortex for embodied intelligence?

A Majumdar, K Yadav, S Arnaud, J Ma… - Advances in …, 2023 - proceedings.neurips.cc
We present the largest and most comprehensive empirical study of pre-trained visual
representations (PVRs) or visual 'foundation models' for Embodied AI. First, we curate …

ESC: Exploration with soft commonsense constraints for zero-shot object navigation

K Zhou, K Zheng, C Pryor, Y Shen… - International …, 2023 - proceedings.mlr.press
The ability to accurately locate and navigate to a specific object is a crucial capability for
embodied agents that operate in the real world and interact with objects to complete tasks …

Habitat 2.0: Training home assistants to rearrange their habitat

A Szot, A Clegg, E Undersander… - Advances in neural …, 2021 - proceedings.neurips.cc
We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in
interactive 3D environments and complex physics-enabled scenarios. We make …

Navigating to objects in the real world

T Gervet, S Chintala, D Batra, J Malik, DS Chaplot - Science Robotics, 2023 - science.org
Semantic navigation is necessary to deploy mobile robots in uncontrolled environments
such as homes or hospitals. Many learning-based approaches have been proposed in …

Habitat-Matterport 3D dataset (HM3D): 1000 large-scale 3D environments for embodied AI

SK Ramakrishnan, A Gokaslan, E Wijmans… - arXiv preprint arXiv …, 2021 - arxiv.org
We present the Habitat-Matterport 3D (HM3D) dataset. HM3D is a large-scale dataset of
1,000 building-scale 3D reconstructions from a diverse set of real-world locations. Each …

Embodied navigation with multi-modal information: A survey from tasks to methodology

Y Wu, P Zhang, M Gu, J Zheng, X Bai - Information Fusion, 2024 - Elsevier
Embodied AI aims to create agents that complete complex tasks by interacting with the
environment. A key problem in this field is embodied navigation which understands multi …

Scaling data generation in vision-and-language navigation

Z Wang, J Li, Y Hong, Y Wang, Q Wu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …

Simple but effective: CLIP embeddings for embodied AI

A Khandelwal, L Weihs, R Mottaghi… - Proceedings of the …, 2022 - openaccess.thecvf.com
Contrastive language image pretraining (CLIP) encoders have been shown to be beneficial
for a range of visual tasks from classification and detection to captioning and image …

ZSON: Zero-shot object-goal navigation using multimodal goal embeddings

A Majumdar, G Aggarwal, B Devnani… - Advances in …, 2022 - proceedings.neurips.cc
We present a scalable approach for learning open-world object-goal navigation (ObjectNav)–
the task of asking a virtual robot (agent) to find any instance of an object in an unexplored …