DreamTalk: When expressive talking head generation meets diffusion probabilistic models
Diffusion models have shown remarkable success in a variety of downstream generative
tasks, yet remain under-explored in the important and challenging expressive talking head …
FaceDiffuser: Speech-driven 3D facial animation synthesis using diffusion
Speech-driven 3D facial animation synthesis has been a challenging task both in industry
and research. Recent methods mostly focus on deterministic deep learning methods …
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
In this paper, we abstract the process of people hearing speech, extracting meaningful cues,
and creating various dynamically audio-consistent talking faces, termed Listening and …
Generation and detection of manipulated multimodal audiovisual content: Advances, trends and open challenges
Generative deep learning techniques have invaded the public discourse recently. Despite
the advantages, the applications to disinformation are concerning as the counter-measures …
AniTalker: Animate vivid and diverse talking faces through identity-decoupled facial motion encoding
The paper introduces AniTalker, an innovative framework designed to generate lifelike
talking faces from a single portrait. Unlike existing models that primarily focus on verbal cues …
VASA-1: Lifelike audio-driven talking faces generated in real time
We introduce VASA, a framework for generating lifelike talking faces with appealing visual
affective skills (VAS) given a single static image and a speech audio clip. Our premiere …
GAIA: Zero-shot talking avatar generation
Zero-shot talking avatar generation aims at synthesizing natural talking videos from speech
and a single portrait image. Previous methods have relied on domain-specific heuristics …
Make your actor talk: Generalizable and high-fidelity lip sync with motion and appearance disentanglement
We aim to edit the lip movements in a talking video according to the given speech while
preserving the personal identity and visual details. The task can be decomposed into two …