Deep learning for visual speech analysis: A survey
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …
Pose-controllable talking face generation by implicitly modularized audio-visual representation
While accurate lip synchronization has been achieved for arbitrary-subject audio-driven
talking face generation, the problem of how to efficiently drive the head pose remains …
Non-rigid neural radiance fields: Reconstruction and novel view synthesis of a dynamic scene from monocular video
We present Non-Rigid Neural Radiance Fields (NR-NeRF), a reconstruction and
novel view synthesis approach for general non-rigid dynamic scenes. Our approach takes …
AD-NeRF: Audio driven neural radiance fields for talking head synthesis
Generating high-fidelity talking head video by fitting with the input audio sequence is a
challenging problem that receives considerable attentions recently. In this paper, we …
Expressive talking head generation with granular audio-visual control
Generating expressive talking heads is essential for creating virtual humans. However,
existing one- or few-shot methods focus on lip-sync and head motion, ignoring the emotional …
StyleHEAT: One-shot high-resolution editable talking face generation via pre-trained StyleGAN
One-shot talking face generation aims at synthesizing a high-quality talking face video from
an arbitrary portrait image, driven by a video or an audio segment. In this work, we provide a …
Towards fast, accurate and stable 3D dense face alignment
Existing methods of 3D dense face alignment … thus limiting the scope of their practical applications. In this paper,
we propose a novel regression framework which makes a balance among speed, accuracy …
EAMM: One-shot emotional talking face via audio-based emotion-aware motion model
Although significant progress has been made to audio-driven talking face generation,
existing methods either neglect facial emotion or cannot be applied to arbitrary subjects. In …
FSGAN: Subject agnostic face swapping and reenactment
We present Face Swapping GAN (FSGAN) for face swapping and reenactment.
Unlike previous work, FSGAN is subject agnostic and can be applied to pairs of faces …
Live speech portraits: real-time photorealistic talking-head animation
To the best of our knowledge, we first present a live system that generates personalized
photorealistic talking-head animation only driven by audio signals at over 30 fps. Our system …