Prompt-guided Precise Audio Editing with Diffusion Models

M Xu, C Li, D Su, W Liang, D Yu - arXiv preprint arXiv:2406.04350, 2024 - arxiv.org
Audio editing involves the arbitrary manipulation of audio content through precise control.
Although text-guided diffusion models have made significant advancements in text-to-audio …

Speech Synthesis along Perceptual Voice Quality Dimensions

F Rautenberg, M Kuhlmann, F Seebauer… - arXiv preprint arXiv …, 2025 - arxiv.org
While expressive speech synthesis or voice conversion systems mainly focus on controlling
or manipulating abstract prosodic characteristics of speech, such as emotion or accent, we …

Beyond the "Industry Standard": Focusing Gender-Affirming Voice Training Technologies on Individualized Goal Exploration

K Povinelli, H Zhu, Y Zhao - arXiv preprint arXiv:2410.09958, 2024 - arxiv.org
Gender-affirming voice training is critical for the transition process for many transgender
individuals, enabling their voice to align with their gender identity. Individualized voice goals …