Alternate endings: Improving prosody for incremental neural TTS with predicted future text input

B Stephenson, T Hueber, L Girin, L Besacier - arXiv preprint arXiv …, 2021 - arxiv.org
The prosody of a spoken word is determined by its surrounding context. In incremental
text-to-speech synthesis, where the synthesizer produces an output before it has access to the …

Self-adaptive and Incremental Machine Speech Chain

S Novitasari - 2022 - naist.repo.nii.ac.jp
In human spoken communication, speech production and perception are inseparable. It is
reflected in the human speech chain mechanism, showing that humans speak while …