PortraitBooth: A versatile portrait model for fast identity-preserved personalization

X Peng, J Zhu, B Jiang, Y Tai, D Luo… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advancements in personalized image generation using diffusion models have been
noteworthy. However, existing methods suffer from inefficiencies due to the requirement for …

Dreamtalk: When expressive talking head generation meets diffusion probabilistic models

Y Ma, S Zhang, J Wang, X Wang, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models have shown remarkable success in a variety of downstream generative
tasks, yet remain under-explored in the important and challenging expressive talking head …

FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio

C Xu, Y Liu, J Xing, W Wang, M Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we abstract the process of people hearing speech, extracting meaningful cues,
and creating various dynamically audio-consistent talking faces, termed Listening and …

Texdreamer: Towards zero-shot high-fidelity 3d human texture generation

Y Liu, J Zhu, J Tang, S Zhang, J Zhang, W Cao… - … on Computer Vision, 2024 - Springer
Texturing 3D humans with semantic UV maps remains a challenge due to the difficulty of
acquiring reasonably unfolded UV. Despite recent text-to-3D advancements in supervising …

FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance Head-pose and Facial Expression Features

A Rochow, M Schwarz… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The task of face reenactment is to transfer the head motion and facial expressions from a
driving video to the appearance of a source image which may be of a different person (cross …

EarSE: Bringing Robust Speech Enhancement to COTS Headphones

D Duan, Y Chen, W Xu, T Li - Proceedings of the ACM on Interactive …, 2024 - dl.acm.org
Speech enhancement is regarded as the key to the quality of digital communication and is
gaining increasing attention in the research field of audio processing. In this paper, we …

Dream-talk: diffusion-based realistic emotional audio-driven method for single image talking face generation

C Zhang, C Wang, J Zhang, H Xu, G Song… - arXiv preprint arXiv …, 2023 - arxiv.org
The generation of emotional talking faces from a single portrait image remains a significant
challenge. The simultaneous achievement of expressive emotional talking and accurate lip …

MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer

S Yang, H Li, J Wu, M Jing, L Li, R Ji, J Liang… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have demonstrated superior performance in the field of portrait animation.
However, current approaches rely on either the visual or the audio modality to control character …