Drop the beat! Freestyler for Accompaniment Conditioned Rap Voice Generation

Z Ning, S Wang, Y Jiang, J Yao, L He, S Pan… - arXiv preprint arXiv …, 2024 - arxiv.org
Rap, a prominent genre of vocal performance, remains underexplored in vocal generation.
General vocal synthesis depends on precise note and duration inputs, requiring users to …

Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models

T Wang, M Ge, C Gong, C Qiang, H Wang… - arXiv preprint arXiv …, 2025 - arxiv.org
Recently, emotional speech generation and speaker cloning have garnered significant
interest in text-to-speech (TTS). With the open-sourcing of codec language TTS models …