MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer

S Yang, H Li, J Wu, M Jing, L Li, R Ji, J Liang… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have demonstrated superior performance in the field of portrait animation.
However, current approaches rely on either visual or audio modality to control character …

Generalizable and Animatable Gaussian Head Avatar

X Chu, T Harada - Advances in Neural Information …, 2025 - proceedings.neurips.cc
In this paper, we propose Generalizable and Animatable Gaussian head Avatar (GAGA) for
one-shot animatable head avatar reconstruction. Existing methods rely on neural radiance …

HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

X Zheng, C Wen, Z Li, W Zhang, Z Su, X Chang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present a novel 3D head avatar creation approach capable of generalizing
from few-shot in-the-wild data with high-fidelity and animatable robustness. Given the …

Facial Animation Strategies for Improved Emotional Expression in Virtual Reality

H Song, B Kwon - Electronics, 2024 - mdpi.com
The portrayal of emotions by virtual characters is crucial in virtual reality (VR)
communication. Effective communication in VR relies on a shared understanding, which is …

MegActor: Harness the Power of Raw Video for Vivid Portrait Animation

S Yang, H Li, J Wu, M Jing, L Li, R Ji, J Liang… - arXiv preprint arXiv …, 2024 - arxiv.org
Although raw driving videos contain richer information on facial expressions than
intermediate representations such as landmarks in the field of portrait animation, they are …

1 Interpretability Of Multimodal Learning

T Zhang - stonezhang.com
Human-Readable SVG Generation. The primary goal of my work [1] has been to transform
complex image data into Scalable Vector Graphics (SVGs) that are interpretable both to …