- Academic Search

M Xu, C Li, D Su, W Liang, D Yu - arXiv preprint arXiv:2406.04350, 2024 - arxiv.org

Audio editing involves the arbitrary manipulation of audio content through precise control.
Although text-guided diffusion models have made significant advancements in text-to-audio …

Speech Synthesis along Perceptual Voice Quality Dimensions

F Rautenberg, M Kuhlmann, F Seebauer… - arXiv preprint arXiv …, 2025 - arxiv.org

While expressive speech synthesis or voice conversion systems mainly focus on controlling
or manipulating abstract prosodic characteristics of speech, such as emotion or accent, we …

Beyond the "Industry Standard": Focusing Gender-Affirming Voice Training Technologies on Individualized Goal Exploration

K Povinelli, H Zhu, Y Zhao - arXiv preprint arXiv:2410.09958, 2024 - arxiv.org

Gender-affirming voice training is critical for the transition process for many transgender
individuals, enabling their voice to align with their gender identity. Individualized voice goals …

Permod: Perceptually grounded voice modification with latent diffusion models

Prompt-guided Precise Audio Editing with Diffusion Models
