Audio-driven facial animation by joint end-to-end learning of pose and emotion
We present a machine learning technique for driving 3D facial animation by audio input in
real time and with low latency. Our deep neural network learns a mapping from input …
Audiovisual speech synthesis: An overview of the state-of-the-art
W Mattheyses, W Verhelst - Speech Communication, 2015 - Elsevier
We live in a world where there are countless interactions with computer systems in everyday situations. In the most ideal case, this interaction feels as familiar and as natural as the …
Expressive visual text-to-speech using active appearance models
This paper presents a complete system for expressive visual text-to-speech (VTTS), which is
capable of producing expressive output, in the form of a 'talking head', given an input text and …
An image-based visual speech animation system
An image-based visual speech animation system is presented in this paper. A video model
is proposed to preserve the video dynamics of a talking face. The model represents a video …
Comprehensive many-to-many phoneme-to-viseme mapping and its application for concatenative visual speech synthesis
The use of visemes as atomic speech units in visual speech analysis and synthesis systems
is well-established. Viseme labels are determined using a many-to-one phoneme-to-viseme …
[PDF] Photo-realistic expressive text to talking head synthesis.
A controllable computer animated avatar that could be used as a natural user interface for
computers is demonstrated. Driven by text and emotion input, it generates expressive …
[BOOK] Audiovisual speech processing
When we speak, we configure the vocal tract which shapes the visible motions of the face
and the patterning of the audible speech acoustics. Similarly, we use these visible and …
Video-realistic expressive audio-visual speech synthesis for the Greek language
High quality expressive speech synthesis has been a long-standing goal towards natural
human-computer interaction. Generating a talking head which is both realistic and …
Relating objective and subjective performance measures for AAM-based visual speech synthesis
We compare two approaches for synthesizing visual speech using active appearance
models (AAMs): one that utilizes acoustic features as input, and one that utilizes a phonetic …
[PDF] Speaker-adaptive visual speech synthesis in the HMM-framework.
In this paper we apply speaker-adaptive and speaker-dependent training of hidden Markov
models (HMMs) to visual speech synthesis. In speaker-dependent training we use data from …