Review of end-to-end speech synthesis technology based on deep learning
As an indispensable part of modern human-computer interaction systems, speech synthesis
technology helps users get the output of intelligent machines more easily and intuitively, thus …
The emotional voices database: Towards controlling the emotion dimension in voice generation systems
In this paper, we present a database of emotional speech intended to be open-sourced and
used for synthesis and generation purposes. It contains data for male and female actors in …
Exploring transfer learning for low resource emotional tts
During the last few years, spoken language technologies have known a big improvement
thanks to Deep Learning. However Deep Learning-based algorithms require amounts of …
Multi-label extreme learning machine (MLELMs) for bangla regional speech recognition
Extensive research has been conducted in the past to determine age, gender, and words
spoken in Bangla speech, but no work has been conducted to identify the regional language …
The Blizzard Challenge 2023
The Blizzard Challenge 2023 is the eighteenth edition of the text-to-speech synthesis
Blizzard Challenge. This year, two French datasets were provided to participants and two …
Learning and controlling the source-filter representation of speech with a variational autoencoder
Understanding and controlling latent representations in deep generative models is a
challenging yet important problem for analyzing, transforming and generating various types …
A methodology for controlling the emotional expressiveness in synthetic speech-a deep learning approach
N Tits - 2019 8th International Conference on Affective …, 2019 - ieeexplore.ieee.org
In this project, we aim to build a Text-to-Speech system able to produce speech with a
controllable emotional expressiveness. We propose a methodology for solving this problem …
Local Style Tokens: Fine-Grained Prosodic Representations For TTS Expressive Control
Neural Text-To-Speech (TTS) models achieve great performances regarding naturalness,
but modeling expressivity remains an ongoing challenge. Some success was found through …
FastLips: an End-to-End Audiovisual Text-to-Speech System with Lip Features Prediction for Virtual Avatars
In this paper, we introduce FastLips, an end-to-end neural model designed to generate
speech and co-verbal facial movements from text, animating a virtual avatar. Based on the …
Impact of Segmentation and Annotation in French end-to-end Synthesis
Audio books are commonly used to train text-to-speech models (TTS), as they offer large
phonetic content with rather expressive pronunciation, but number and sizes of publicly …