A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights

W Lei, J Wang, F Ma, G Huang, L Liu - arxiv preprint arxiv:2407.08428, 2024 - arxiv.org
Human video generation is a dynamic and rapidly evolving task that aims to synthesize 2D
human body video sequences with generative models given control conditions such as text …

Animate-X: Universal character image animation with enhanced motion representation

S Tan, B Gong, X Wang, S Zhang, D Zheng… - arxiv preprint arxiv …, 2024 - arxiv.org
Character image animation, which generates high-quality videos from a reference image
and target pose sequence, has seen significant progress in recent years. However, most …

MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis

D Qiu, Z Chen, R Wang, M Fan, C Yu, J Huang… - arxiv preprint arxiv …, 2024 - arxiv.org
Recent advancements in character video synthesis still depend on extensive fine-tuning or
complex 3D modeling processes, which can restrict accessibility and hinder real-time …

StableAnimator: High-Quality Identity-Preserving Human Image Animation

S Tu, Z Xing, X Han, ZQ Cheng, Q Dai, C Luo… - arxiv preprint arxiv …, 2024 - arxiv.org
Current diffusion models for human image animation struggle to ensure identity (ID)
consistency. This paper presents StableAnimator, the first end-to-end ID-preserving video …

LinguaLinker: Audio-driven portraits animation with implicit facial control enhancement

R Zhang, Y Fang, Z Lu, P Cheng, Z Huang… - arxiv preprint arxiv …, 2024 - arxiv.org
This study delves into the intricacies of synchronizing facial dynamics with multilingual audio
inputs, focusing on the creation of visually compelling, time-synchronized animations …

Human motion video generation: A survey

H Xue, X Luo, Z Hu, X Zhang, X Xiang, Y Dai, J Liu… - Authorea …, 2024 - techrxiv.org
Human motion video generation has garnered significant research interest due to its broad
applications, enabling innovations such as photorealistic singing heads or dynamic avatars …

EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation

R Meng, X Zhang, Y Li, C Ma - arxiv preprint arxiv:2411.10061, 2024 - arxiv.org
Recent work on human animation usually involves audio, pose, or movement maps
conditions, thereby achieves vivid animation quality. However, these methods often face …

Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance

L Hu, G Wang, Z Shen, X Gao, D Meng, L Zhuo… - arxiv preprint arxiv …, 2025 - arxiv.org
Recent character image animation methods based on diffusion models, such as Animate
Anyone, have made significant progress in generating consistent and generalizable …

HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation

Q Gan, Y Ren, C Zhang, Z Ye, P Xie, X Yin… - arxiv preprint arxiv …, 2025 - arxiv.org
Human motion video generation has advanced significantly, while existing methods still
struggle with accurately rendering detailed body parts like hands and faces, especially in …

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Y Pang, B Zhu, B Lin, M Zheng, FEH Tay… - arxiv preprint arxiv …, 2024 - arxiv.org
In this work, we present DreamDance, a novel method for animating human images using
only skeleton pose sequences as conditional inputs. Existing approaches struggle with …