Smoodi: Stylized motion diffusion model

L Zhong, Y Xie, V Jampani, D Sun, H Jiang - European Conference on …, 2024 - Springer
We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized
motion driven by content texts and style motion sequences. Unlike existing methods that …

Ditto: Diffusion inference-time t-optimization for music generation

Z Novack, J McAuley, T Berg-Kirkpatrick… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose Diffusion Inference-Time T-Optimization (DITTO), a general-purpose
framework for controlling pre-trained text-to-music diffusion models at inference-time via …

Iterative motion editing with natural language

P Goel, KC Wang, CK Liu, K Fatahalian - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
Text-to-motion diffusion models can generate realistic animations from text prompts, but do
not support fine-grained motion editing controls. In this paper, we present a method for using …

Inference-time scaling for diffusion models beyond scaling denoising steps

N Ma, S Tong, H Jia, H Hu, YC Su, M Zhang… - arXiv preprint arXiv …, 2025 - arxiv.org
Generative models have made significant impacts across various domains, largely due to
their ability to scale during training by increasing data, computational resources, and model …

RoHM: Robust Human Motion Reconstruction via Diffusion

S Zhang, BL Bhatnagar, Y Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We propose RoHM, an approach for robust 3D human motion reconstruction from monocular
RGB(-D) videos in the presence of noise and occlusions. Most previous approaches either …

COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation

J Li, Y Yuan, D Rempe, H Zhang, P Molchanov… - … on Computer Vision, 2024 - Springer
Estimating global human motion from moving cameras is challenging due to the
entanglement of human and camera motions. To mitigate the ambiguity, existing methods …

CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control

G Tevet, S Raab, S Cohan, D Reda, Z Luo… - arXiv preprint arXiv …, 2024 - arxiv.org
Motion diffusion models and Reinforcement Learning (RL) based control for physics-based
simulations have complementary strengths for human motion generation. The former is …

Dart: A diffusion-based autoregressive motion model for real-time text-driven motion control

K Zhao, G Li, S Tang - arXiv preprint arXiv:2410.05260, 2024 - arxiv.org
Text-conditioned human motion generation, which allows for user interaction through natural
language, has become increasingly popular. Existing methods typically generate short …

CPoser: An Optimization-after-Parsing Approach for Text-to-Pose Generation Using Large Language Models

Y Li, B Chen, Z Ren, YX Ding, L Liu, T Shao… - ACM Transactions on …, 2024 - dl.acm.org
Text-to-pose generation is challenging due to the complexity of natural language and
human posture semantics. Utilizing large language models (LLMs) for text-to-pose …

SKEL-Betweener: a Neural Motion Rig for Interactive Motion Authoring

D Agrawal, J Buhmann, D Borer, RW Sumner… - ACM Transactions on …, 2024 - dl.acm.org
Authoring 3D motions is a laborious process that requires manipulating and coordinating
many control handles over time. Neural motion representations learned from large motion …