A survey of embodied AI: From simulators to research tasks
There has been an emerging paradigm shift from the era of “internet AI” to “embodied AI,”
where AI algorithms and agents no longer learn from datasets of images, videos or text …
Recent advancements in end-to-end autonomous driving using deep learning: A survey
End-to-End driving is a promising paradigm as it circumvents the drawbacks associated with
modular systems, such as their overwhelming complexity and propensity for error …
UGIF-Net: An efficient fully guided information flow network for underwater image enhancement
Light traveling through water results in strong scattering across color channels, restricting
visibility in underwater images. Many cutting-edge underwater image enhancement …
Open-vocabulary queryable scene representations for real world planning
Large language models (LLMs) have unlocked new capabilities of task planning from
human instructions. However, prior attempts to apply LLMs to real-world robotic tasks are …
Object goal navigation using goal-oriented semantic exploration
This work studies the problem of object goal navigation which involves navigating to an
instance of the given object category in unseen environments. End-to-end learning-based …
PONI: Potential functions for ObjectGoal navigation with interaction-free learning
State-of-the-art approaches to ObjectGoal navigation (ObjectNav) rely on reinforcement
learning and typically require significant computational resources and time for learning. We …
Reinforced cross-modal matching and self-supervised imitation learning for vision-language navigation
Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out
natural language instructions inside real 3D environments. In this paper, we study how to …
OK-Robot: What really matters in integrating open-knowledge models for robotics
Remarkable progress has been made in recent years in the fields of vision, language, and
robotics. We now have vision models capable of recognizing objects based on language …
Habitat-web: Learning embodied object-search strategies from human demonstrations at scale
R Ramrakhya, E Undersander… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present a large-scale study of imitating human demonstrations on tasks that require a
virtual robot to search for objects in new environments: (1) ObjectGoal Navigation (e.g. 'find & …
BEHAVIOR: Benchmark for everyday household activities in virtual, interactive, and ecological environments
We introduce BEHAVIOR, a benchmark for embodied AI with 100 activities in simulation,
spanning a range of everyday household chores such as cleaning, maintenance, and food …