Yuancheng Wang
The Chinese University of Hong Kong, Shenzhen
Verified email at link.cuhk.edu.cn
Title
Cited by
Year
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models
Z Ju*, Y Wang*, K Shen*, X Tan*, D Xin, D Yang, Y Liu, Y Leng, K Song, ...
ICML 2024, 2024
Cited by: 124 | Year: 2024
AUDIT: Audio editing by following instructions with latent diffusion models
Y Wang, Z Ju, X Tan, L He, Z Wu, J Bian
NeurIPS 2023, 2023
Cited by: 52 | Year: 2023
Automated testing of image captioning systems
B Yu, Z Zhong, X Qin, J Yao, Y Wang, P He
Proceedings of the 31st ACM SIGSOFT International Symposium on Software …, 2022
Cited by: 27 | Year: 2022
Amphion: An open-source audio, music and speech generation toolkit
X Zhang*, L Xue*, Y Gu*, Y Wang*, J Li*, H He, C Wang, S Liu, X Chen, ...
IEEE Spoken Language Technology Workshop (SLT 2024), 2023
Cited by: 25 | Year: 2023
RALL-E: Robust codec language modeling with chain-of-thought prompting for text-to-speech synthesis
D Xin, X Tan, K Shen, Z Ju, D Yang, Y Wang, S Takamichi, H Saruwatari, ...
arXiv preprint arXiv:2404.03204, 2024
Cited by: 23 | Year: 2024
FoleyCrafter: Bring silent videos to life with lifelike and synchronized sounds
Y Zhang, Y Gu, Y Zeng, Z Xing, Y Wang, Z Wu, K Chen
arXiv preprint arXiv:2407.01494, 2024
Cited by: 20 | Year: 2024
Emilia: An extensive, multilingual, and diverse speech dataset for large-scale speech generation
H He, Z Shang, C Wang, X Li, Y Gu, H Hua, L Liu, C Yang, J Li, P Shi, ...
IEEE Spoken Language Technology Workshop (SLT 2024), 2024
Cited by: 19 | Year: 2024
MaskGCT: Zero-shot text-to-speech with masked generative codec transformer
Y Wang, H Zhan, L Liu, R Zeng, H Guo, J Zheng, Q Zhang, X Zhang, ...
ICLR 2025, 2024
Cited by: 12 | Year: 2024
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
J Ao*, Y Wang*, X Tian, D Chen, J Zhang, L Lu, Y Wang, H Li, Z Wu
NeurIPS 2024, 2024
Cited by: 9 | Year: 2024
Debatts: Zero-shot debating text-to-speech synthesis
Y Huang, Y Wang, J Li, H Guo, H He, S Zhang, Z Wu
arXiv preprint arXiv:2411.06540, 2024
Cited by: 2 | Year: 2024
Noro: A Noise-Robust One-shot Voice Conversion System with Hidden Speaker Representation Capabilities
H He, Y Song, Y Wang, H Li, X Zhang, L Wang, G Huang, ES Chng, Z Wu
arXiv preprint arXiv:2411.19770, 2024
Cited by: 1 | Year: 2024
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
H He, Z Shang, C Wang, X Li, Y Gu, H Hua, L Liu, C Yang, J Li, P Shi, ...
arXiv preprint arXiv:2501.15907, 2025
Year: 2025
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement
J Zhang, J Yang, Z Fang, Y Wang, Z Zhang, Z Wang, F Fan, Z Wu
arXiv preprint arXiv:2501.15417, 2025
Year: 2025
Overview of the Amphion Toolkit (v0.2)
J Li, X Zhang, Y Wang, H He, C Wang, L Wang, H Liao, J Ao, Z Xie, ...
arXiv preprint arXiv:2501.15442, 2025
Year: 2025