The first multimodal information based speech processing (MISP) challenge: Data, tasks, baselines and results H Chen, H Zhou, J Du, CH Lee, J Chen, S Watanabe, SM Siniscalchi, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 54 | 2022 |
Audio-visual speech recognition in MISP2021 challenge: Dataset release and deep analysis H Chen, J Du, Y Dai, CH Lee, SM Siniscalchi, S Watanabe, ... Group 1, 2, 2022 | 27 | 2022 |
Deep neural network based regression approach for acoustic echo cancellation Q Lei, H Chen, J Hou, L Chen, L Dai Proceedings of the 2019 4th International Conference on Multimedia Systems …, 2019 | 23 | 2019 |
The multimodal information based speech processing (MISP) 2022 challenge: Audio-visual diarization and recognition Z Wang, S Wu, H Chen, MK He, J Du, CH Lee, J Chen, S Watanabe, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 21 | 2023 |
Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement H Chen, J Du, Y Hu, LR Dai, BC Yin, CH Lee Neural Networks 143, 171-182, 2021 | 21 | 2021 |
The USTC-NERCSLIP systems for the CHiME-7 DASR challenge R Wang, M He, J Du, H Zhou, S Niu, H Chen, Y Yue, G Yang, S Wu, L Sun, ... arXiv preprint arXiv:2308.14638, 2023 | 14 | 2023 |
Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries. H Chen, J Du, Y Hu, LR Dai, BC Yin, CH Lee Interspeech, 3001-3005, 2021 | 11 | 2021 |
The multimodal information based speech processing (MISP) 2023 challenge: Audio-visual target speaker extraction S Wu, C Wang, H Chen, Y Dai, C Zhang, R Wang, H Lan, J Du, CH Lee, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 8 | 2024 |
Improving audio-visual speech recognition by lip-subword correlation based visual pre-training and cross-modal fusion encoder Y Dai, H Chen, J Du, X Ding, N Ding, F Jiang, CH Lee 2023 IEEE International Conference on Multimedia and Expo (ICME), 2627-2632, 2023 | 6 | 2023 |
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition Y Dai, H Chen, J Du, R Wang, S Chen, H Wang, CH Lee Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 5 | 2024 |
Deep learning based audio-visual multi-speaker DOA estimation using permutation-free loss function Q Wang, H Chen, Y Jiang, Z Wang, Y Wang, J Du, CH Lee 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 5 | 2022 |
Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments. H Zhou, J Du, H Chen, Z Jing, S Xiong, CH Lee Interspeech, 341-345, 2021 | 5 | 2021 |
Semi-supervised multi-channel speaker diarization with cross-channel attention S Wu, J Du, MK He, S Niu, H Chen, H Tang, CH Lee 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 4 | 2023 |
Hierarchical audio-visual information fusion with multi-label joint decoding for MER 2023 H Wang, Y Xi, H Chen, J Du, Y Song, Q Wang, H Zhou, C Wang, J Ma, ... Proceedings of the 31st ACM International Conference on Multimedia, 9531-9535, 2023 | 3 | 2023 |
Incorporating visual information reconstruction into progressive learning for optimizing audio-visual speech enhancement CY Zhang, H Chen, J Du, BC Yin, J Pan, CH Lee ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 3 | 2023 |
Grammar-supervised end-to-end speech recognition with part-of-speech tagging and dependency parsing G Wan, T Mao, J Zhang, H Chen, J Gao, Z Ye Applied Sciences 13 (7), 4243, 2023 | 3 | 2023 |
Optimizing Audio-Visual Speech Enhancement Using Multi-Level Distortion Measures for Audio-Visual Speech Recognition H Chen, Q Wang, J Du, BC Yin, J Pan, CH Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 2 | 2024 |
Incorporating lip features into audio-visual multi-speaker DOA estimation by gated fusion Y Jiang, H Chen, J Du, Q Wang, CH Lee ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 2 | 2023 |
Collaborative viseme subword and end-to-end modeling for word-level lip reading H Chen, Q Wang, J Du, GS Wan, SF Xiong, BC Yin, J Pan, CH Lee IEEE Transactions on Multimedia, 2024 | 1 | 2024 |
Summary on the multimodal information based speech processing (MISP) 2022 challenge H Chen, S Wu, Y Dai, Z Wang, J Du, CH Lee, J Chen, S Watanabe, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 1 | 2023 |