Conventional and contemporary approaches used in text to speech synthesis: A review
N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer
Nowadays speech synthesis or text to speech (TTS), an ability of system to produce human
like natural sounding voice from the written text, is gaining popularity in the field of speech …
Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
Hidden Markov models (HMMs) and Gaussian mixture models (GMMs) are the two most
common types of acoustic models used in statistical parametric approaches for generating …
Deep voice 2: Multi-speaker neural text-to-speech
We introduce a technique for augmenting neural text-to-speech (TTS) with low-dimensional
trainable speaker embeddings to generate different voices from a single model. As a starting …
Statistical parametric speech synthesis using deep neural networks
Conventional approaches to statistical parametric speech synthesis typically use decision
tree-clustered context-dependent hidden Markov models (HMMs) to represent probability …
Visual to sound: Generating natural sound for videos in the wild
As two of the five traditional human senses (sight, hearing, taste, smell, and touch), vision
and sound are basic sources through which humans understand the world. Often correlated …
Deep voice 2: Multi-speaker neural text-to-speech
We introduce a technique for augmenting neural text-to-speech (TTS) with low-dimensional
trainable speaker embeddings to generate different voices from a single model. As a starting …
Method and system for non-parametric voice conversion
I Agiomyrgiannakis - US Patent 9,183,830, 2015 - Google Patents
A method and system is disclosed for non-parametric speech conversion.
A text-to-speech (TTS) synthesis system may …
WaveNet Vocoder with Limited Training Data for Voice Conversion.
This paper investigates the approaches of building WaveNet vocoders with limited training
data for voice conversion (VC). Current VC systems using statistical acoustic models always …
Evaluation of speaker verification security and detection of HMM-based synthetic speech
In this paper, we evaluate the vulnerability of speaker verification (SV) systems to synthetic
speech. The SV systems are based on either the Gaussian mixture model–universal …
Method and system for building text-to-speech voice from diverse recordings
A method and system is disclosed for building a speech database for a text-
to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions …