Spoofing and countermeasures for speaker verification: A survey
While biometric authentication has advanced significantly in recent years, evidence shows
the technology can be susceptible to malicious spoofing attacks. The research community …
Conventional and contemporary approaches used in text to speech synthesis: A review
N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer
Nowadays speech synthesis or text to speech (TTS), an ability of system to produce human
like natural sounding voice from the written text, is gaining popularity in the field of speech …
A survey on neural speech synthesis
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …
WaveNet: A generative model for raw audio
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
The model is fully probabilistic and autoregressive, with the predictive distribution for each …
One-shot voice conversion by separating speaker and content representations with instance normalization
Recently, voice conversion (VC) without parallel data has been successfully adapted to multi-
target scenario in which a single model is trained to convert the input voice to many different …
Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech
The quality of text-to-speech (TTS) voices built from noisy speech is compromised.
Enhancing the speech data before training has been shown to improve quality but voices …
Statistical parametric speech synthesis using deep neural networks
Conventional approaches to statistical parametric speech synthesis typically use decision
tree-clustered context-dependent hidden Markov models (HMMs) to represent probability …
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
Long short-term memory recurrent neural networks (LSTM-RNNs) have been applied to
various speech applications including acoustic modeling for statistical parametric speech …
Statistical parametric speech synthesis
This review gives a general overview of techniques used in statistical parametric speech
synthesis. One instance of these techniques, called hidden Markov model (HMM)-based …
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
In this paper, we describe a novel spectral conversion method for voice conversion (VC). A
Gaussian mixture model (GMM) of the joint probability density of source and target features …