The metaverse: Survey, trends, novel pipeline ecosystem & future directions
The Metaverse offers a second world beyond reality, where boundaries are non-existent,
and possibilities are endless through engagement and immersive experiences using the …
Tidybot: Personalized robot assistance with large language models
For a robot to personalize physical assistance effectively, it must learn user preferences that
can be generally reapplied to future scenarios. In this work, we investigate personalization of …
Habitat 2.0: Training home assistants to rearrange their habitat
We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in
interactive 3D environments and complex physics-enabled scenarios. We make …
🏘️ ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Massive datasets and high-capacity models have driven many recent advancements in
computer vision and natural language understanding. This work presents a platform to …
VIMA: General robot manipulation with multimodal prompts
Prompt-based learning has emerged as a successful paradigm in natural language
processing, where a single general-purpose language model can be instructed to perform …
BEHAVIOR-1K: A benchmark for embodied AI with 1,000 everyday activities and realistic simulation
We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered
robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an …
Simple but effective: CLIP embeddings for embodied AI
Contrastive language image pretraining (CLIP) encoders have been shown to be beneficial
for a range of visual tasks from classification and detection to captioning and image …
Instruction-driven history-aware policies for robotic manipulations
In human environments, robots are expected to accomplish a variety of manipulation tasks
given simple natural language instructions. Yet, robotic manipulation is extremely …
AI2-THOR: An interactive 3D environment for visual AI
We introduce The House Of inteRactions (THOR), a framework for visual AI research,
available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor …
Scaling data generation in vision-and-language navigation
Recent research in language-guided visual navigation has demonstrated a significant
demand for the diversity of traversable environments and the quantity of supervision for …