MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer

S Yang, H Li, J Wu, M Jing, L Li, R Ji, J Liang… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have demonstrated superior performance in the field of portrait animation.
However, current approaches rely on either visual or audio modality to control character …

Human motion video generation: A survey

H Xue, X Luo, Z Hu, X Zhang, X Xiang, Y Dai, J Liu… - Authorea …, 2024 - techrxiv.org
Human motion video generation has garnered significant research interest due to its broad
applications, enabling innovations such as photorealistic singing heads or dynamic avatars …

DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

Z Shao, D Wang, QY Tian, YD Yang, H Meng… - arXiv preprint arXiv …, 2024 - arxiv.org
Although neural rendering has made significant advancements in creating lifelike,
animatable full-body and head avatars, incorporating detailed expressions into full-body …

LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync

C Li, C Zhang, W Xu, J Xie, W Feng, B Peng… - arXiv preprint arXiv …, 2024 - arxiv.org
We present LatentSync, an end-to-end lip sync framework based on audio conditioned
latent diffusion models without any intermediate motion representation, diverging from …

VividWav2Lip: High-Fidelity Facial Animation Generation Based on Speech-Driven Lip Synchronization

L Liu, J Wang, S Chen, Z Li - Electronics, 2024 - mdpi.com
Speech-driven lip synchronization is a crucial technology for generating realistic facial
animations, with broad application prospects in virtual reality, education, training, and other …