Composition of deep and spiking neural networks for very low bit rate speech coding

M Cernak, A Lazaridis, A Asaei… - IEEE/ACM Transactions …, 2016 - ieeexplore.ieee.org
Most current very low bit rate (VLBR) speech coding systems use hidden Markov model
(HMM) based speech recognition and synthesis techniques. This allows transmission of …

Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser Based on DNN.

B Potard, MP Aylett, DA Braude, P Motlicek - INTERSPEECH, 2016 - isca-archive.org
This paper presents a text to speech (TTS) extension to Kaldi, a liberally licensed open
source speech recognition system. The system, Idlak Tangle, uses recent deep neural …

Characterisation and generation of expressivity in function of speaking styles for audiobook synthesis

A Sini - 2020 - theses.hal.science
In this thesis, we study the expressivity of read speech with a particular type of data, which
are audiobooks. Audiobooks are audio recordings of literary works made by professionals …

Rapid development of new TTS voices by neural network adaptation

T Delić, S Suzić, M Sečujski… - 2018 17th International …, 2018 - ieeexplore.ieee.org
Recent development of parametric speech synthesis based on neural networks (NN) has
inspired a range of new techniques for multispeaker speech synthesis. In this paper, a very …

Introducing prosodic speaker identity for a better expressive speech synthesis control

A Sini, S Le Maguer, D Lolive… - … Conference on Speech …, 2020 - hal.science
To have more control over Text-to-Speech (TTS) synthesis and to improve expressivity, it is
necessary to disentangle prosodic information carried by the speaker's voice identity from …

F0 modeling for Isarn speech synthesis using deep neural networks and syllable-level feature representation.

P Janyoi, P Seresangtakul - Int. Arab J. Inf. Technol., 2020 - iajit.org
The generation of the fundamental frequency (F0) plays an important role in speech
synthesis, which directly influences the naturalness of synthetic speech. In conventional …

Probabilistic amplitude demodulation features in speech synthesis for improving prosody

A Lazaridis, M Cernak, PN Garner - Interspeech 2016, 2016 - infoscience.epfl.ch
Amplitude demodulation (AM) is a signal decomposition technique by which a signal can be
decomposed into a product of two signals, i.e., a quickly varying carrier and a slowly varying …

Segmental foreign accent

RP Ramon - 2019 - core.ac.uk
The speech of non-native speakers typically differs from that of native speakers, resulting in
a foreign accent (FA) which may have important consequences for communication …

Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis

A Lazaridis, M Cernak, PE Honnet, PN Garner - 2016 - infoscience.epfl.ch
In our recent work, a novel speech synthesis with enhanced prosody (SSEP) system using
probabilistic amplitude demodulation (PAD) features was introduced. These features were …

Demo of Idlak Tangle, An Open Source DNN-Based Parametric Speech Synthesiser

B Potard, M Aylett, DA Braude, P Motlicek - Interspeech 2016, 2016 - research.ed.ac.uk
This paper presents a text to speech (TTS) extension to Kaldi, a liberally licensed open
source speech recognition system. The system, Idlak Tangle, uses recent deep neural …