SMooDi: Stylized Motion Diffusion Model

L Zhong, Y Xie, V Jampani, D Sun, H Jiang - European Conference on …, 2024 - Springer
We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized
motion driven by content texts and style motion sequences. Unlike existing methods that …

Human-object interaction from human-level instructions

Z Wu, J Li, P Xu, CK Liu - arXiv preprint arXiv:2406.17840, 2024 - arxiv.org
Intelligent agents must autonomously interact with the environments to perform daily tasks
based on human-level instructions. They need a foundational understanding of the world to …

CORE4D: A 4D human-object-human interaction dataset for collaborative object rearrangement

Y Liu, C Zhang, R Xing, B Tang, B Yang, L Yi - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding how humans cooperatively rearrange household objects is critical for VR/AR
and human-robot interaction. However, in-depth studies on modeling these behaviors are …

AvatarGO: Zero-shot 4D human-object interaction generation and animation

Y Cao, L Pan, K Han, KYK Wong, Z Liu - arXiv preprint arXiv:2410.07164, 2024 - arxiv.org
Recent advancements in diffusion models have led to significant improvements in the
generation and animation of 4D full-body human-object interactions (HOI). Nevertheless …

Mimicking-Bench: A benchmark for generalizable humanoid-scene interaction learning via human mimicking

Y Liu, B Yang, L Zhong, H Wang, L Yi - arXiv preprint arXiv:2412.17730, 2024 - arxiv.org
Learning generic skills for humanoid robots interacting with 3D scenes by mimicking human
data is a key research challenge with significant implications for robotics and real-world …

DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

Q Wu, Z Dou, S Xu, S Shimada, C Wang, Z Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
Reconstructing 3D hand-face interactions with deformations from a single image is a
challenging yet crucial task with broad applications in AR, VR, and gaming. The challenges …

MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations

L Xu, S Hua, Z Lin, Y Liu, F Ma, Y Yan, X Xi… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we tackle the problem of how to build and benchmark a large motion model
(LMM). The ultimate goal of LMM is to serve as a foundation model for versatile motion …

ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation

H Li, HX Yu, J Li, J Wu - arXiv preprint arXiv:2412.18600, 2024 - arxiv.org
Human-scene interaction (HSI) generation is crucial for applications in embodied AI, virtual
reality, and robotics. While existing methods can synthesize realistic human motions in 3D …

DAViD: Modeling Dynamic Affordance of 3D Objects using Pre-trained Video Diffusion Models

H Kim, S Baek, H Joo - arXiv preprint arXiv:2501.08333, 2025 - arxiv.org
Understanding the ability of humans to use objects is crucial for AI to improve daily life.
Existing studies for learning such ability focus on human-object patterns (e.g., contact, spatial …

Pay Attention and Move Better: Harnessing Attention for Interactive Motion Generation and Training-free Editing

LH Chen, S Lu, W Dai, Z Dou, X Ju, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
This research delves into the problem of interactive editing of human motion generation.
Previous motion diffusion models lack explicit modeling of the word-level text-motion …