Composition of deep and spiking neural networks for very low bit rate speech coding
Most current very low bit rate (VLBR) speech coding systems use hidden Markov model
(HMM) based speech recognition and synthesis techniques. This allows transmission of …
Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser Based on DNN.
This paper presents a text to speech (TTS) extension to Kaldi, a liberally licensed open
source speech recognition system. The system, Idlak Tangle, uses recent deep neural …
Characterisation and generation of expressivity in function of speaking styles for audiobook synthesis
A Sini - 2020 - theses.hal.science
In this thesis, we study the expressivity of read speech with a particular type of data, which
are audiobooks. Audiobooks are audio recordings of literary works made by professionals …
Rapid development of new TTS voices by neural network adaptation
Recent development of parametric speech synthesis based on neural networks (NN) has
inspired a range of new techniques for multispeaker speech synthesis. In this paper, a very …
Introducing prosodic speaker identity for a better expressive speech synthesis control
To have more control over Text-to-Speech (TTS) synthesis and to improve expressivity, it is
necessary to disentangle prosodic information carried by the speaker's voice identity from …
F0 modeling for Isarn speech synthesis using deep neural networks and syllable-level feature representation.
The generation of the fundamental frequency (F0) plays an important role in speech
synthesis, which directly influences the naturalness of synthetic speech. In conventional …
Probabilistic amplitude demodulation features in speech synthesis for improving prosody
Amplitude demodulation (AM) is a signal decomposition technique by which a signal can be
decomposed to a product of two signals, i.e., a quickly varying carrier and a slowly varying …
Segmental foreign accent
RP Ramon - 2019 - core.ac.uk
The speech of non-native speakers typically differs from that of native speakers, resulting in
a foreign accent (FA) which may have important consequences for communication …
Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis
A Lazaridis, M Cernak, PE Honnet, PN Garner - 2016 - infoscience.epfl.ch
In our recent work, a novel speech synthesis with enhanced prosody (SSEP) system using
probabilistic amplitude demodulation (PAD) features was introduced. These features were …
Demo of Idlak Tangle, An Open Source DNN-Based Parametric Speech Synthesiser
This paper presents a text to speech (TTS) extension to Kaldi, a liberally licensed open
source speech recognition system. The system, Idlak Tangle, uses recent deep neural …