A survey of embodied AI: From simulators to research tasks

J Duan, S Yu, HL Tan, H Zhu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
There has been an emerging paradigm shift from the era of “internet AI” to “embodied AI,”
where AI algorithms and agents no longer learn from datasets of images, videos or text …

Move as you say, interact as you can: Language-guided human motion generation with scene affordance

Z Wang, Y Chen, B Jia, P Li, J Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Despite significant advancements in text-to-motion synthesis, generating language-guided
human motion within 3D environments poses substantial challenges. These challenges …

AI2-THOR: An interactive 3D environment for visual AI

E Kolve, R Mottaghi, W Han, E VanderBilt… - arXiv preprint arXiv …, 2017 - arxiv.org
We introduce The House Of inteRactions (THOR), a framework for visual AI research,
available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor …

Augmenting reinforcement learning with behavior primitives for diverse manipulation tasks

S Nasiriany, H Liu, Y Zhu - 2022 International Conference on …, 2022 - ieeexplore.ieee.org
Realistic manipulation tasks require a robot to interact with an environment with a prolonged
sequence of motor actions. While deep reinforcement learning methods have recently …

Where2act: From pixels to actions for articulated 3D objects

K Mo, LJ Guibas, M Mukadam… - Proceedings of the …, 2021 - openaccess.thecvf.com
One of the fundamental goals of visual perception is to allow agents to meaningfully interact
with their environment. In this paper, we take a step towards that long-term goal: we extract …

A survey of visual affordance recognition based on deep learning

D Chen, D Kong, J Li, S Wang… - IEEE Transactions on Big …, 2023 - ieeexplore.ieee.org
Visual affordance recognition is an important research topic in robotics, human-computer
interaction, and other computer vision tasks. In recent years, deep learning-based …

An outlook into the future of egocentric vision

C Plizzari, G Goletto, A Furnari, S Bansal… - International Journal of …, 2024 - Springer
What will the future be? We wonder! In this survey, we explore the gap between current
research in egocentric vision and the ever-anticipated future, where wearable computing …

SceneFun3D: Fine-grained functionality and affordance understanding in 3D scenes

A Delitzas, A Takmaz, F Tombari… - Proceedings of the …, 2024 - openaccess.thecvf.com
Existing 3D scene understanding methods are heavily focused on 3D semantic and instance
segmentation. However, identifying objects and their parts only constitutes an intermediate …

Learning affordance grounding from exocentric images

H Luo, W Zhai, J Zhang, Y Cao… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Affordance grounding, the task of grounding (i.e., localizing) action-possibility regions in objects,
faces the challenge of establishing an explicit link with object parts due to the diversity …

Detecting human-object contact in images

Y Chen, SK Dwivedi, MJ Black… - Proceedings of the …, 2023 - openaccess.thecvf.com
Humans constantly contact objects to move and perform tasks. Thus, detecting human-
object contact is important for building human-centered artificial intelligence. However, there …