RWF-2000: An Open Large Scale Video Database for Violence Detection M Cheng, K Cai, M Li 2020 25th International Conference on Pattern Recognition (ICPR), 4183-4190, 2021 | 259 | 2021 |
Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction M Cheng, W Wang, Y Zhang, X Qin, M Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 36 | 2023 |
Computer-Aided Autism Spectrum Disorder Diagnosis with Behavior Signal Processing M Cheng, Y Zhang, Y Xie, Y Pan, X Li, W Liu, C Yu, D Zhang, Y Xing, ... IEEE Transactions on Affective Computing 14 (4), 2982-3000, 2023 | 17 | 2023 |
The DKU Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge M Cheng, H Wang, Y Wang, M Li ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 14 | 2022 |
VoxBlink: A Large Scale Speaker Verification Dataset on Camera Y Lin, X Qin, G Zhao, M Cheng, N Jiang, H Wu, M Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 12 | 2024 |
The DKU Post-Challenge Audio-Visual Wake Word Spotting System for The 2021 MISP Challenge: Deep Analysis H Wang, M Cheng, Q Fu, M Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 10 | 2023 |
The DKU-DukeECE Diarization System for the VoxCeleb Speaker Recognition Challenge 2022 W Wang, X Qin, M Cheng, Y Zhang, K Wang, M Li arXiv preprint arXiv:2210.01677, 2022 | 9 | 2022 |
Multi-Input Multi-Output Target-Speaker Voice Activity Detection for Unified, Flexible, and Robust Audio-Visual Speaker Diarization M Cheng, M Li arXiv preprint arXiv:2401.08052, 2024 | 8 | 2024 |
The WHU-Alibaba Audio-Visual Speaker Diarization System for the MISP 2022 Challenge M Cheng, H Wang, Z Wang, Q Fu, M Li ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023 M Cheng, W Wang, X Qin, Y Lin, N Jiang, G Zhao, M Li National Conference on Man-Machine Speech Communication, 330-337, 2023 | 7 | 2023 |
VoxBlink2: A 100k+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark Y Lin, M Cheng, F Zhang, Y Gao, S Zhang, M Li arXiv preprint arXiv:2407.11510, 2024 | 6 | 2024 |
Responsive Social Smile: A Machine Learning based Multimodal Behavior Assessment Framework towards Early Stage Autism Screening Y Pan, K Cai, M Cheng, X Zou, M Li 2020 25th International Conference on Pattern Recognition (ICPR), 2240-2247, 2021 | 6 | 2021 |
Efficient Personal Voice Activity Detection with Wake Word Reference Speech B Zeng, M Cheng, Y Tian, H Liu, M Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 4 | 2024 |
Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer H Wang, M Cheng, Q Fu, M Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 3 | 2024 |
Assessing The Social Skills of Children with Autism Spectrum Disorder via Language-Image Pre-training Models W Liu, M Cheng, Y Pan, L Yuan, S Hu, M Li, S Zeng | 2 | 2023 |
Joint Inference of Speaker Diarization and ASR with Multi-Stage Information Sharing W Wang, D Cai, M Cheng, M Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 1 | 2024 |
Cross-modal Assisted Training for Abnormal Event Recognition in Elevators X Chen, X Gong, M Cheng, Q Deng, M Li Proceedings of the 2021 International Conference on Multimodal Interaction …, 2021 | 1 | 2021 |
Sequence-to-Sequence Neural Diarization with Automatic Speaker Detection and Representation M Cheng, Y Lin, M Li arXiv preprint arXiv:2411.13849, 2024 | | 2024 |
A Multimodal Dynamic Neural Network for Call for Help Recognition in Elevators R Ju, H Chu, Y Wang, Q Deng, M Cheng, M Li Companion Publication of the 2021 International Conference on Multimodal …, 2021 | | 2021 |