A survey on neural speech synthesis
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …
Spoken Language Identification: An overview of past and present research trends
D O'Shaughnessy - Speech Communication, 2024 - Elsevier
Identification of the language used in spoken utterances is useful for multiple applications,
e.g., assist in directing or automating telephone calls, or selecting which language-specific …
A novel tracking deep wavelet auto-encoder method for intelligent fault diagnosis of electric locomotive bearings
The condition monitoring of electric locomotive has attracted more and more attention due to
its significance for improving the security, reliability and automation level. In this paper, a …
Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources
Speech synthesis has made significant strides thanks to the transition from machine learning
to deep learning models. Contemporary text-to-speech (TTS) models possess the capability …
Speech prosody enhances the neural processing of syntax
Human language relies on the correct processing of syntactic information, as it is essential
for successful communication between speakers. As an abstract level of language, syntax …
Hierarchical prosody modeling for non-autoregressive speech synthesis
Prosody modeling is an essential component in modern text-to-speech (TTS) frameworks.
By explicitly providing prosody features to the TTS model, the style of synthesized utterances …
Prosody-controllable spontaneous TTS with neural HMMs
Spontaneous speech has many affective and pragmatic functions that are interesting and
challenging to model in TTS. However, the presence of reduced articulation, fillers …
Prosody and fluency of Finland Swedish as a second language: Investigating global parameters for automated speaking assessment
H Kallio, M Kautonen, M Kuronen - Speech Communication, 2023 - Elsevier
This study investigates prosody and fluency of Finland Swedish as a second language (L2).
The main objective is to investigate global measures of prosody and fluency as predictors of …
Event-related responses reflect chunk boundaries in natural speech
Chunking language has been proposed to be vital for comprehension enabling the
extraction of meaning from a continuous stream of speech. However, neurocognitive …
Intonation Units in spontaneous speech evoke a neural response
Spontaneous speech is produced in chunks called intonation units (IUs). IUs are defined by
a set of prosodic cues and presumably occur in all human languages. Recent work has …