Deep encoder-decoder models for unsupervised learning of controllable speech synthesis

GE Henter, J Lorenzo-Trueba, X Wang… - arXiv preprint arXiv …, 2018 - arxiv.org
Generating versatile and appropriate synthetic speech requires control over the output
expression separate from the spoken text. Important non-textual speech variation is seldom …