Waveform modeling and generation using hierarchical recurrent neural networks for speech bandwidth extension ZH Ling, Y Ai, Y Gu, LR Dai IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (5), 883-894, 2018 | 89 | 2018 |
MP-SENet: A speech enhancement model with parallel denoising of magnitude and phase spectra YX Lu, Y Ai, ZH Ling arXiv preprint arXiv:2305.13686, 2023 | 61 | 2023 |
Singing voice synthesis using deep autoregressive neural networks for acoustic modeling YH Yi, Y Ai, ZH Ling, LR Dai arXiv preprint arXiv:1906.08977, 2019 | 42 | 2019 |
A neural vocoder with hierarchical generation of amplitude and phase spectra for statistical parametric speech synthesis Y Ai, ZH Ling IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 839-851, 2020 | 40 | 2020 |
SampleRNN-based neural vocoder for statistical parametric speech synthesis Y Ai, HC Wu, ZH Ling 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 33 | 2018 |
BDDR: An effective defense against textual backdoor attacks K Shao, J Yang, Y Ai, H Liu, Y Zhang Computers & Security 110, 102433, 2021 | 32 | 2021 |
Neural speech phase prediction based on parallel estimation architecture and anti-wrapping losses Y Ai, ZH Ling ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 26 | 2023 |
APCodec: A neural audio codec with parallel amplitude and phase spectrum encoding and decoding Y Ai, XH Jiang, YX Lu, HP Du, ZH Ling IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 20 | 2024 |
APNet: An all-frame-level neural vocoder incorporating direct prediction of amplitude and phase spectra Y Ai, ZH Ling IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 2145-2157, 2023 | 15 | 2023 |
DNN-based spectral enhancement for neural waveform generators with low-bit quantization Y Ai, JX Zhang, L Chen, ZH Ling ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 13 | 2019 |
The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge H Wu, Z Li, L Xu, Z Zhang, W Zhao, B Gu, Y Ai, Y Lu, J Zhang, Z Ling, ... DADA@IJCAI, 119-124, 2023 | 10 | 2023 |
Knowledge-and-data-driven amplitude spectrum prediction for hierarchical neural vocoders Y Ai, ZH Ling arXiv preprint arXiv:2004.07832, 2020 | 9 | 2020 |
APNet2: high-quality and high-efficiency neural vocoder with direct prediction of amplitude and phase spectra HP Du, YX Lu, Y Ai, ZH Ling National Conference on Man-Machine Speech Communication, 66-80, 2023 | 8 | 2023 |
Denoising-and-dereverberation hierarchical neural vocoder for statistical parametric speech synthesis Y Ai, ZH Ling, WL Wu, A Li IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 2036-2048, 2022 | 7 | 2022 |
Towards high-quality and efficient speech bandwidth extension with parallel amplitude and phase prediction YX Lu, Y Ai, HP Du, ZH Ling IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 6 | 2024 |
Explicit estimation of magnitude and phase spectra in parallel for high-quality speech enhancement YX Lu, Y Ai, ZH Ling arXiv preprint arXiv:2308.08926, 2023 | 6 | 2023 |
Face-driven zero-shot voice conversion with memory-based face-voice alignment ZY Sheng, Y Ai, YN Chen, ZH Ling Proceedings of the 31st ACM International Conference on Multimedia, 8443-8452, 2023 | 5 | 2023 |
Zero-shot personalized lip-to-speech synthesis with face image based voice control ZY Sheng, Y Ai, ZH Ling ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 5 | 2023 |
Incorporating ultrasound tongue images for audio-visual speech enhancement through knowledge distillation RC Zheng, Y Ai, ZH Ling arXiv preprint arXiv:2305.14933, 2023 | 5 | 2023 |
Reverberation modeling for source-filter-based neural vocoder Y Ai, X Wang, J Yamagishi, ZH Ling arXiv preprint arXiv:2005.07379, 2020 | 5 | 2020 |