Exploring the encoding layer and loss function in end-to-end speaker and language recognition system W Cai, J Chen, M Li arXiv preprint arXiv:1804.05160, 2018 | 417 | 2018 |
On-the-fly data loader and utterance-level aggregation for speaker and language recognition W Cai, J Chen, J Zhang, M Li IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 1038-1051, 2020 | 97 | 2020 |
A novel learnable dictionary encoding layer for end-to-end language identification W Cai, Z Cai, X Zhang, X Wang, M Li 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 90 | 2018 |
Countermeasures for Automatic Speaker Verification Replay Spoofing Attack: On Data Augmentation, Feature Representation, Classification and Fusion. W Cai, D Cai, W Liu, G Li, M Li Interspeech, 17-21, 2017 | 87 | 2017 |
The DKU replay detection system for the ASVspoof 2019 challenge: On data augmentation, feature representation, classification, and fusion W Cai, H Wu, D Cai, M Li arXiv preprint arXiv:1907.02663, 2019 | 74 | 2019 |
Utterance-level end-to-end language identification using attention-based CNN-BLSTM W Cai, D Cai, S Huang, M Li ICASSP 2019-2019 IEEE international conference on acoustics, speech and …, 2019 | 69 | 2019 |
Within-sample variability-invariant loss for robust speaker recognition under noisy environments D Cai, W Cai, M Li ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 60 | 2020 |
Analysis of length normalization in end-to-end speaker verification system W Cai, J Chen, M Li arXiv preprint arXiv:1806.03209, 2018 | 45 | 2018 |
Insights into End-to-End Learning Scheme for Language Identification W Cai, Z Cai, W Liu, X Wang, M Li 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 40 | 2018 |
End-to-End Deep Learning Framework for Speech Paralinguistics Detection Based on Perception Aware Spectrum. D Cai, Z Ni, W Liu, W Cai, G Li, M Li INTERSPEECH, 3452-3456, 2017 | 28 | 2017 |
Generalized i-vector representation with phonetic tokenizations and tandem features for both text independent and text dependent speaker verification M Li, L Liu, W Cai, W Liu Journal of Signal Processing Systems 82, 207-215, 2016 | 23 | 2016 |
DIHARD II is still hard: Experimental results and discussions from the DKU-LENOVO team Q Lin, W Cai, L Yang, J Wang, J Zhang, M Li arXiv preprint arXiv:2002.12761, 2020 | 22 | 2020 |
Text-independent voice conversion using deep neural network based phonetic level features H Zheng, W Cai, T Zhou, S Zhang, M Li 2016 23rd International Conference on Pattern Recognition (ICPR), 2872-2877, 2016 | 19 | 2016 |
The SYSU system for the Interspeech 2015 automatic speaker verification spoofing and countermeasures challenge S Weng, S Chen, L Yu, X Wu, W Cai, Z Liu, Y Zhou, M Li 2015 Asia-Pacific Signal and Information Processing Association Annual …, 2015 | 19 | 2015 |
The DKU system for the speaker recognition task of the 2019 VOiCES from a distance challenge D Cai, X Qin, W Cai, M Li arXiv preprint arXiv:1907.02194, 2019 | 16 | 2019 |
End-to-end language identification using NetFV and NetVLAD J Chen, W Cai, D Cai, Z Cai, H Zhong, M Li 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018 | 15 | 2018 |
Speaker diarization system for autism children's real-life audio data T Zhou, W Cai, X Chen, X Zou, S Zhang, M Li 2016 10th International Symposium on Chinese Spoken Language Processing …, 2016 | 10 | 2016 |
Locality sensitive discriminant analysis for speaker verification D Cai, W Cai, Z Ni, M Li 2016 Asia-Pacific Signal and Information Processing Association Annual …, 2016 | 6 | 2016 |
A unified deep speaker embedding framework for mixed-bandwidth speech data W Cai, M Li 2021 Asia-Pacific Signal and Information Processing Association Annual …, 2021 | 5 | 2021 |
Duration dependent covariance regularization in PLDA modeling for speaker verification. W Cai, M Li, L Li, Q Hong INTERSPEECH, 1027-1031, 2015 | 4 | 2015 |