A novel learnable dictionary encoding layer for end-to-end language identification W Cai, Z Cai, X Zhang, X Wang, M Li 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 90 | 2018 |
From speaker verification to multispeaker speech synthesis, deep transfer with feedback constraint Z Cai, C Zhang, M Li Proc. Interspeech 2020, 3974--3978, 2020 | 47 | 2020 |
Insights into end-to-end learning scheme for language identification W Cai, Z Cai, W Liu, X Wang, M Li 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 40 | 2018 |
Polyphone disambiguation for Mandarin Chinese using conditional neural network with multi-level embedding features Z Cai, Y Yang, C Zhang, X Qin, M Li Proc. Interspeech 2019, 2110--2114, 2019 | 30 | 2019 |
Waveform boundary detection for partially spoofed audio Z Cai, W Wang, M Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 27 | 2023 |
SIG-VC: A speaker information guided zero-shot voice conversion system for both human beings and machines H Zhang, Z Cai, X Qin, M Li ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 18 | 2022 |
Identifying source speakers for voice conversion based spoofing attacks on speaker verification systems D Cai, Z Cai, M Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 16 | 2023 |
Cross-lingual multi-speaker speech synthesis with limited bilingual training data Z Cai, Y Yang, M Li Computer Speech & Language 77, 101427, 2023 | 16 | 2023 |
End-to-end language identification using NetFV and NetVLAD J Chen, W Cai, D Cai, Z Cai, H Zhong, M Li 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 15 | 2018 |
The DKU-JNU-EMA electromagnetic articulography database on Mandarin and Chinese dialects with tandem feature based acoustic-to-articulatory inversion Z Cai, X Qin, D Cai, M Li, X Liu, H Zhong 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 14 | 2018 |
Cross-lingual multispeaker text-to-speech under limited-data scenario Z Cai, Y Yang, M Li arXiv preprint arXiv:2005.10441, 2020 | 13 | 2020 |
Integrating frame-level boundary detection and deepfake detection for locating manipulated regions in partially spoofed audio forgery attacks Z Cai, M Li Computer Speech & Language 85, 101597, 2024 | 9 | 2024 |
Electrolaryngeal speech enhancement based on a two stage framework with bottleneck feature refinement and voice conversion Y Yang, H Zhang, Z Cai, Y Shi, M Li, D Zhang, X Ding, J Deng, J Wang Biomedical Signal Processing and Control 80, 104279, 2023 | 8 | 2023 |
Deep speaker embeddings with convolutional neural network on supervector for text-independent speaker recognition D Cai, Z Cai, M Li 2018 Asia-Pacific Signal and Information Processing Association Annual …, 2018 | 7 | 2018 |
The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023 Z Cai, W Wang, Y Wang, M Li arXiv preprint arXiv:2308.10281, 2023 | 6 | 2023 |
Unsupervised query by example spoken term detection using features concatenated with self-organizing map distances H Wu, M Li, Z Cai, H Zhong 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 4 | 2018 |
Invertible Voice Conversion Z Cai, M Li arXiv preprint arXiv:2201.10687, 2022 | 2 | 2022 |
F0 Contour Estimation Using Phonetic Feature in Electrolaryngeal Speech Enhancement Z Cai, Z Xu, M Li ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 2 | 2019 |
The DKU Speech Synthesis System for 2019 Blizzard Challenge Z Cai, C Zhang, Y Yang, M Li Blizzard Challenge Workshop, 2019 | 2 | 2019 |
HLTCOE JHU submission to the Voice Privacy challenge 2024 HL Xinyuan, Z Cai, A Garg, K Duh, LP García-Perera, S Khudanpur, ... arXiv preprint arXiv:2409.08913, 2024 | 1 | 2024 |