NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality X Tan, J Chen, H Liu, J Cong, C Zhang, Y Liu, X Wang, Y Leng, Y Yi, L He, ... IEEE Transactions on Pattern Analysis and Machine Intelligence 46 (6), 4234-4245, 2024 | 229 | 2024 |
HiFiSinger: Towards high-fidelity neural singing voice synthesis J Chen, X Tan, J Luan, G Wen, T Qin, TY Liu arXiv preprint arXiv:2009.01776, 2020 | 99 | 2020 |
Seed-TTS: A family of high-quality versatile speech generation models P Anastassiou, J Chen, J Chen, Y Chen, Z Chen, Z Chen, J Cong, L Deng, ... arXiv preprint arXiv:2406.02430, 2024 | 58 | 2024 |
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis Y Leng, Z Chen, J Guo, H Liu, J Chen, X Tan, D Mandic, L He, XY Li, ... Advances in Neural Information Processing Systems (NeurIPS), 2022 | 53 | 2022 |
ResGrad: Residual denoising diffusion probabilistic models for text to speech Z Chen, Y Wu, Y Leng, J Chen, H Liu, X Tan, Y Cui, K Wang, L He, S Zhao, ... arXiv preprint arXiv:2212.14518, 2022 | 21 | 2022 |
Speech-T: Transducer for text to speech and beyond J Chen, X Tan, Y Leng, J Xu, G Wen, T Qin, TY Liu Advances in Neural Information Processing Systems 34, 6621-6633, 2021 | 21 | 2021 |
Sams-Net: A sliced attention-based neural network for music source separation T Li, J Chen, H Hou, M Li 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 21 | 2021 |
PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription C Zhang, J Yu, LC Chang, X Tan, J Chen, T Qin, K Zhang International Society for Music Information Retrieval (ISMIR), 2022, 2021 | 19 | 2021 |