Chenpeng Du
ByteDance
Verified email at bytedance.com
Title
Cited by
Year
VQTTS: High-fidelity text-to-speech synthesis with self-supervised VQ acoustic feature
C Du, Y Guo, X Chen, K Yu
Interspeech 2022, 1596-1600, 2022
Cited by: 69; Year: 2022
Unicats: A unified context-aware text-to-speech framework with contextual vq-diffusion and vocoding
C Du, Y Guo, F Shen, Z Liu, Z Liang, X Chen, S Wang, H Zhang, K Yu
Proceedings of the AAAI Conference on Artificial Intelligence 38 (16), 17924 …, 2024
Cited by: 47; Year: 2024
Speaker augmentation for low resource speech recognition
C Du, K Yu
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 45; Year: 2020
Dae-talker: High fidelity speech-driven talking face generation with diffusion autoencoder
C Du, Q Chen, T He, X Tan, X Chen, K Yu, S Zhao, J Bian
Proceedings of the 31st ACM International Conference on Multimedia, 4281-4289, 2023
Cited by: 43; Year: 2023
Emodiff: Intensity controllable emotional text-to-speech with soft-label guidance
Y Guo, C Du, X Chen, K Yu
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 36; Year: 2023
Voiceflow: Efficient text-to-speech with rectified flow matching
Y Guo, C Du, Z Ma, X Chen, K Yu
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 33; Year: 2024
Data augmentation for end-to-end code-switching speech recognition
C Du, H Li, Y Lu, L Wang, Y Qian
2021 IEEE Spoken Language Technology Workshop (SLT), 194-200, 2021
Cited by: 30; Year: 2021
Rich prosody diversity modelling with phone-level mixture density network
C Du, K Yu
Interspeech 2021, 3136-3140, 2021
Cited by: 29*; Year: 2021
Towards universal speech discrete tokens: A case study for asr and tts
Y Yang, F Shen, C Du, Z Ma, K Yu, D Povey, X Chen
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 27; Year: 2024
Phone-level prosody modelling with GMM-based MDN for diverse and controllable speech synthesis
C Du, K Yu
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 190-201, 2021
Cited by: 23*; Year: 2021
Vall-t: Decoder-only generative transducer for robust and decoding-controllable text-to-speech
C Du, Y Guo, H Wang, Y Yang, Z Niu, S Wang, H Zhang, X Chen, K Yu
arXiv preprint arXiv:2401.14321, 2024
Cited by: 20; Year: 2024
Towards data selection on tts data for children’s speech recognition
W Wang, Z Zhou, Y Lu, H Wang, C Du, Y Qian
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 17; Year: 2021
Language Model Can Listen While Speaking
Z Ma, Y Song, C Du, J Cong, Z Chen, Y Wang, Y Wang, X Chen
arXiv preprint arXiv:2408.02622, 2024
Cited by: 16; Year: 2024
Unsupervised word-level prosody tagging for controllable speech synthesis
Y Guo, C Du, K Yu
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 14; Year: 2022
Acoustic bpe for speech generation with discrete tokens
F Shen, Y Guo, C Du, X Chen, K Yu
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 10; Year: 2024
Speaker adaptive text-to-speech with timbre-normalized vector-quantized feature
C Du, Y Guo, X Chen, K Yu
IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 3446-3456, 2023
Cited by: 10; Year: 2023
Synaug: Synthesis-based data augmentation for text-dependent speaker verification
C Du, B Han, S Wang, Y Qian, K Yu
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 10; Year: 2021
DSE-TTS: dual speaker embedding for cross-lingual text-to-speech
S Liu, Y Guo, C Du, X Chen, K Yu
Interspeech 2023, 616-620, 2023
Cited by: 8; Year: 2023
Anitalker: animate vivid and diverse talking faces through identity-decoupled facial motion encoding
T Liu, F Chen, S Fan, C Du, Q Chen, X Chen, K Yu
Proceedings of the 32nd ACM International Conference on Multimedia, 6696-6705, 2024
Cited by: 7; Year: 2024
Neural fusion for voice cloning
B Chen, C Du, K Yu
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1993-2001, 2022
Cited by: 7; Year: 2022