Vevo: Controllable zero-shot voice imitation with self-supervised disentanglement

X Zhang, X Zhang, K Peng, Z Tang, V Manohar… - arXiv preprint arXiv …, 2025 - arxiv.org
The imitation of voice, targeted on specific speech attributes such as timbre and speaking
style, is crucial in speech generation. However, existing methods rely heavily on annotated …

Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS

TN Nguyen, S Akti, NQ Pham, A Waibel - arXiv preprint arXiv:2410.14997, 2024 - arxiv.org
Previous approaches on accent conversion (AC) mainly aimed at making non-native speech
sound more native while maintaining the original content and speaker identity. However …

Diffusion-Based Method with TTS Guidance for Foreign Accent Conversion

Q Bai, S Wang, Z Liu, M Zhang, W Rao… - 2024 IEEE 14th …, 2024 - ieeexplore.ieee.org
Accent conversion (AC) aims to alter the accent of spoken language while preserving the
original content and speaker characteristics. While any accent can be selected as a target …