NaturalSpeech: End-to-end text-to-speech synthesis with human-level quality | X Tan, J Chen, H Liu, J Cong, C Zhang, Y Liu, X Wang, Y Leng, Y Yi, L He, ... | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024 | 221 | 2024 |
HiFiSinger: Towards high-fidelity neural singing voice synthesis | J Chen, X Tan, J Luan, G Wen, T Qin, TY Liu | arXiv preprint arXiv:2009.01776, 2020 | 98 | 2020 |
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis | Y Leng, Z Chen, J Guo, H Liu, J Chen, X Tan, D Mandic, L He, XY Li, ... | Advances in Neural Information Processing Systems (NeurIPS), 2022 | 54 | 2022 |
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models | P Anastassiou, J Chen, J Chen, Y Chen, Z Chen, Z Chen, J Cong, L Deng, ... | arXiv preprint arXiv:2406.02430, 2024 | 49 | 2024 |
Sams-Net: A sliced attention-based neural network for music source separation | T Li, J Chen, H Hou, M Li | 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 23 | 2021 |
Speech-T: Transducer for text to speech and beyond | J Chen, X Tan, Y Leng, J Xu, G Wen, T Qin, TY Liu | Advances in Neural Information Processing Systems 34, 6621-6633, 2021 | 21 | 2021 |
ResGrad: Residual denoising diffusion probabilistic models for text to speech | Z Chen, Y Wu, Y Leng, J Chen, H Liu, X Tan, Y Cui, K Wang, L He, S Zhao, ... | arXiv preprint arXiv:2212.14518, 2022 | 20 | 2022 |
PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription | C Zhang, J Yu, LC Chang, X Tan, J Chen, T Qin, K Zhang | International Society for Music Information Retrieval (ISMIR), 2022 | 19 | 2021 |