The Metaverse: Survey, trends, novel pipeline ecosystem & future directions

H Sami, A Hammoud, M Arafeh… - … Surveys & Tutorials, 2024 - ieeexplore.ieee.org
The Metaverse offers a second world beyond reality, where boundaries are non-existent,
and possibilities are endless through engagement and immersive experiences using the …

TidyBot: Personalized robot assistance with large language models

J Wu, R Antonova, A Kan, M Lepert, A Zeng, S Song… - Autonomous …, 2023 - Springer
For a robot to personalize physical assistance effectively, it must learn user preferences that
can be generally reapplied to future scenarios. In this work, we investigate personalization of …

Habitat 2.0: Training home assistants to rearrange their habitat

A Szot, A Clegg, E Undersander… - Advances in neural …, 2021 - proceedings.neurips.cc
We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in
interactive 3D environments and complex physics-enabled scenarios. We make …

🏘️ ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

M Deitke, E VanderBilt, A Herrasti… - Advances in …, 2022 - proceedings.neurips.cc
Massive datasets and high-capacity models have driven many recent advancements in
computer vision and natural language understanding. This work presents a platform to …

VIMA: General robot manipulation with multimodal prompts

Y Jiang, A Gupta, Z Zhang, G Wang… - arXiv preprint …, 2022 - authors.library.caltech.edu
Prompt-based learning has emerged as a successful paradigm in natural language
processing, where a single general-purpose language model can be instructed to perform …

BEHAVIOR-1K: A benchmark for embodied AI with 1,000 everyday activities and realistic simulation

C Li, R Zhang, J Wong, C Gokmen… - … on Robot Learning, 2023 - proceedings.mlr.press
We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered
robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an …

Simple but effective: CLIP embeddings for embodied AI

A Khandelwal, L Weihs, R Mottaghi… - Proceedings of the …, 2022 - openaccess.thecvf.com
Contrastive language image pretraining (CLIP) encoders have been shown to be beneficial
for a range of visual tasks from classification and detection to captioning and image …
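
For orientation, the approach named in this entry amounts to encoding each agent observation with a frozen, pretrained CLIP visual encoder. The following is a minimal sketch using the public openai/CLIP package; the model variant ("ViT-B/32"), device, and image file name are illustrative assumptions, not details taken from the paper, and the policy network that would consume the features is omitted.

    # Minimal sketch: extract a frozen CLIP image embedding for one observation frame.
    # Assumes `pip install git+https://github.com/openai/CLIP.git`, torch, and Pillow.
    import torch
    import clip
    from PIL import Image

    model, preprocess = clip.load("ViT-B/32", device="cpu")  # frozen pretrained encoder
    model.eval()

    image = preprocess(Image.open("frame.png")).unsqueeze(0)  # (1, 3, 224, 224) tensor
    with torch.no_grad():
        features = model.encode_image(image)                  # (1, 512) visual features
    print(features.shape)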

Instruction-driven history-aware policies for robotic manipulations

PL Guhur, S Chen, RG Pinel… - … on Robot Learning, 2023 - proceedings.mlr.press
In human environments, robots are expected to accomplish a variety of manipulation tasks
given simple natural language instructions. Yet, robotic manipulation is extremely …

AI2-THOR: An interactive 3D environment for visual AI

E Kolve, R Mottaghi, W Han, E VanderBilt… - arXiv preprint arXiv …, 2017 - arxiv.org
We introduce The House Of inteRactions (THOR), a framework for visual AI research,
available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor …
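
Since this entry points at the publicly released framework, here is a minimal sketch of stepping an agent through a scene with the ai2thor Python package; the scene name and action string are stock examples from the project documentation, not specifics from the entry above.

    # Minimal sketch: step an agent in an AI2-THOR scene via the ai2thor package.
    # Assumes `pip install ai2thor`; "FloorPlan1" and "MoveAhead" are illustrative defaults.
    from ai2thor.controller import Controller

    controller = Controller(scene="FloorPlan1")   # launch a kitchen scene
    event = controller.step(action="MoveAhead")   # advance the agent one step
    print(event.metadata["agent"]["position"])    # agent pose after the action
    controller.stop()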

Scaling data generation in vision-and-language navigation

Z Wang, J Li, Y Hong, Y Wang, Q Wu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …