Multimodal image synthesis and editing: A survey and taxonomy
As information exists in various modalities in the real world, effective interaction and fusion
among multimodal information play a key role in the creation and perception of multimodal …
Generative artificial intelligence: a systematic review and applications
In recent years, the study of artificial intelligence (AI) has undergone a paradigm shift. This
has been propelled by the groundbreaking capabilities of generative models both in …
SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation
Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …
EMO: Emote Portrait Alive - generating expressive portrait videos with Audio2Video diffusion model under weak conditions
In this work, we tackle the challenge of enhancing the realism and expressiveness in talking
head video generation by focusing on the dynamic and nuanced relationship between audio …
Generating holistic 3D human motion from speech
This work addresses the problem of generating 3D holistic body motions from human
speech. Given a speech recording, we synthesize sequences of 3D body poses, hand …
Pose-controllable talking face generation by implicitly modularized audio-visual representation
While accurate lip synchronization has been achieved for arbitrary-subject audio-driven
talking face generation, the problem of how to efficiently drive the head pose remains …
Expressive talking head generation with granular audio-visual control
Generating expressive talking heads is essential for creating virtual humans. However,
existing one- or few-shot methods focus on lip-sync and head motion, ignoring the emotional …
Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset
One-shot talking face generation should synthesize high visual quality facial videos with
reasonable animations of expression and head pose, and just utilize arbitrary driving audio …
StyleHEAT: One-shot high-resolution editable talking face generation via pre-trained StyleGAN
One-shot talking face generation aims at synthesizing a high-quality talking face video from
an arbitrary portrait image, driven by a video or an audio segment. In this work, we provide a …
Diffused Heads: Diffusion models beat GANs on talking-face generation
Talking face generation has historically struggled to produce head movements and natural
facial expressions without guidance from additional reference videos. Recent developments …