Neural HMMs are all you need (for high-quality attention-free TTS)

S Mehta, É Székely, J Beskow… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Neural sequence-to-sequence TTS has achieved significantly better output quality than
statistical speech synthesis using HMMs. However, neural TTS is generally not probabilistic …

OverFlow: Putting flows on top of neural transducers for better TTS

S Mehta, A Kirkland, H Lameris, J Beskow… - arXiv preprint arXiv …, 2022 - arxiv.org
Neural HMMs are a type of neural transducer recently proposed for sequence-to-sequence
modelling in text-to-speech. They combine the best features of classic statistical speech …

Time-varying Normalizing Flow for Generative Modeling of Dynamical Signals

A Ghosh, AE Fontcuberta… - 2022 30th European …, 2022 - ieeexplore.ieee.org
We develop a time-varying normalizing flow (TVNF) for explicit generative modeling of
dynamical signals. Being explicit, it can generate samples of dynamical signals, and …