Physcene: Physically interactable 3d scene synthesis for embodied ai

Y Yang, B Jia, P Zhi, S Huang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
With recent developments in Embodied Artificial Intelligence (EAI) research, there has been
a growing demand for high-quality large-scale interactive scene generation. While prior …

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding

B Jia, Y Chen, H Yu, Y Wang, X Niu, T Liu, Q Li… - … on Computer Vision, 2024 - Springer
3D vision-language (3D-VL) grounding, which aims to align language with 3D
physical environments, stands as a cornerstone in developing embodied agents. In …

Motionlcm: Real-time controllable motion generation via latent consistency model

W Dai, LH Chen, J Wang, J Liu, B Dai… - European Conference on …, 2024 - Springer
This work introduces MotionLCM, extending controllable motion generation to a real-time
level. Existing methods for spatial-temporal control in text-conditioned motion generation …

Anyskill: Learning open-vocabulary physical skill for interactive agents

J Cui, T Liu, N Liu, Y Yang, Y Zhu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Traditional approaches in physics-based motion generation, centered around imitation
learning and reward shaping, often struggle to adapt to new scenarios. To tackle this …

Remos: 3d motion-conditioned reaction synthesis for two-person interactions

A Ghosh, R Dabral, V Golyanik, C Theobalt… - … on Computer Vision, 2024 - Springer
Current approaches for 3D human motion synthesis generate high-quality animations of
digital humans performing a wide variety of actions and gestures. However, a notable …

Multi-modal situated reasoning in 3d scenes

X Linghu, J Huang, X Niu, XS Ma… - Advances in Neural …, 2025 - proceedings.neurips.cc
Situation awareness is essential for understanding and reasoning about 3D scenes in
embodied AI agents. However, existing datasets and benchmarks for situated understanding …

Human-object interaction from human-level instructions

Z Wu, J Li, P Xu, CK Liu - arxiv preprint arxiv:2406.17840, 2024 - arxiv.org
Intelligent agents must autonomously interact with the environments to perform daily tasks
based on human-level instructions. They need a foundational understanding of the world to …

Core4d: A 4d human-object-human interaction dataset for collaborative object rearrangement

Y Liu, C Zhang, R Xing, B Tang, B Yang, L Yi - arxiv preprint arxiv …, 2024 - arxiv.org
Understanding how humans cooperatively rearrange household objects is critical for VR/AR
and human-robot interaction. However, in-depth studies on modeling these behaviors are …

Contact-aware human motion generation from textual descriptions

S Ma, Q Cao, J Zhang, D Tao - arxiv preprint arxiv:2403.15709, 2024 - arxiv.org
This paper addresses the problem of generating 3D interactive human motion from text.
Given a textual description depicting the actions of different body parts in contact with static …

Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction

Z Lou, Q Cui, T Wang, Z Song… - Advances in …, 2025 - proceedings.neurips.cc
Diverse human motion prediction (HMP) is a fundamental application in computer vision that
has recently attracted considerable interest. Prior methods primarily focus on the stochastic …