[BOOK] Text-to-speech synthesis using found data for low-resource languages
E Cooper - 2019 - search.proquest.com
Text-to-speech synthesis is a key component of interactive, speech-based systems.
Typically, building a high-quality voice requires collecting dozens of hours of speech from a …
[PDF] Subjective and Objective Evaluation of Speech Intelligibility Enhancement Under Constant Energy and Duration Constraints.
Speakers appear to adopt strategies to improve speech intelligibility for interlocutors in
adverse acoustic conditions. Generated speech, whether synthetic, recorded or live, may …
Utterance selection for optimizing intelligibility of TTS voices trained on ASR data
E Cooper, X Wang - Interspeech 2017, 2017 - par.nsf.gov
This paper describes experiments in training HMM-based text-to-speech (TTS) voices on
data collected for Automatic Speech Recognition (ASR) training. We compare a number of …
Can objective measures predict the intelligibility of modified HMM-based synthetic speech in noise?
Synthetic speech can be modified to improve intelligibility in noise. In order to perform
modifications automatically, it would be useful to have an objective measure that could …
Intelligibility enhancement of HMM-generated speech in additive noise by modifying Mel cepstral coefficients to increase the glimpse proportion
This paper describes speech intelligibility enhancement for Hidden Markov Model (HMM)
generated synthetic speech in noise. We present a method for modifying the Mel cepstral …
Multimodal physiological quality-of-experience assessment of text-to-speech systems
With the growing complexity of various text-to-speech systems, it is becoming more
important to understand the underlying perceptual and judgement processes that drive user …
Cepstral analysis based on the Glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise
In this paper we introduce a new cepstral coefficient extraction method based on an
intelligibility measure for speech in noise, the Glimpse Proportion measure. This new …
The Art of Storytelling: Multi-Agent Generative AI for Dynamic Multimodal Narratives
This paper introduces the concept of an education tool that utilizes Generative Artificial
Intelligence (GenAI) to enhance storytelling for children. The system combines GenAI-driven …
Emilia: a speech corpus for Argentine Spanish text to speech synthesis
HM Torres, JA Gurlekian, DA Evin… - Language Resources …, 2019 - Springer
This paper introduces Emilia, a speech corpus created to build a female voice in Spanish
spoken in Buenos Aires for the Aromo text-to-speech system. Aromo is a unit selection text …
Fusion of magnitude and phase-based features for objective evaluation of TTS voice
This paper analyzes the distance-based objective measures for evaluation of Text-to-
Speech (TTS) systems (which are commonly used objective measures). In this paper, we …