PhonemeVec: A Phoneme-Level Contextual Prosody Representation For Speech Synthesis

S Wang, LP Chen, Y Ai, Y Hu, ZH Ling - ACM Transactions on Asian and …, 2025 - dl.acm.org
Recently, fine-grained prosody representations have emerged and attracted growing
attention to address the one-to-many problem in text-to-speech (TTS). In this paper, we …

Controllable prosody generation with partial inputs

DA Iliescu, DSR Mohan, TH Teh… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
We address the problem of human-in-the-loop control for generating prosody in the context
of text-to-speech synthesis. Controlling prosody is challenging because existing generative …