PhonemeVec: A Phoneme-Level Contextual Prosody Representation For Speech Synthesis
Recently, fine-grained prosody representations have emerged and attracted growing
attention to address the one-to-many problem in text-to-speech (TTS). In this paper, we …
Controllable prosody generation with partial inputs
We address the problem of human-in-the-loop control for generating prosody in the context
of text-to-speech synthesis. Controlling prosody is challenging because existing generative …