Hang Chen
Title
Cited by
Year
The first multimodal information based speech processing (MISP) challenge: Data, tasks, baselines and results
H Chen, H Zhou, J Du, CH Lee, J Chen, S Watanabe, SM Siniscalchi, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 54 | Year: 2022
Audio-visual speech recognition in MISP2021 challenge: Dataset release and deep analysis
H Chen, J Du, Y Dai, CH Lee, SM Siniscalchi, S Watanabe, ...
Group 1, 2, 2022
Cited by: 27 | Year: 2022
Deep neural network based regression approach for acoustic echo cancellation
Q Lei, H Chen, J Hou, L Chen, L Dai
Proceedings of the 2019 4th International Conference on Multimedia Systems …, 2019
Cited by: 23 | Year: 2019
The multimodal information based speech processing (MISP) 2022 challenge: Audio-visual diarization and recognition
Z Wang, S Wu, H Chen, MK He, J Du, CH Lee, J Chen, S Watanabe, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 21 | Year: 2023
Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement
H Chen, J Du, Y Hu, LR Dai, BC Yin, CH Lee
Neural Networks 143, 171-182, 2021
Cited by: 21 | Year: 2021
The USTC-NERCSLIP systems for the CHiME-7 DASR challenge
R Wang, M He, J Du, H Zhou, S Niu, H Chen, Y Yue, G Yang, S Wu, L Sun, ...
arXiv preprint arXiv:2308.14638, 2023
Cited by: 14 | Year: 2023
Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries
H Chen, J Du, Y Hu, LR Dai, BC Yin, CH Lee
Interspeech, 3001-3005, 2021
Cited by: 11 | Year: 2021
The multimodal information based speech processing (MISP) 2023 challenge: Audio-visual target speaker extraction
S Wu, C Wang, H Chen, Y Dai, C Zhang, R Wang, H Lan, J Du, CH Lee, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 8 | Year: 2024
Improving audio-visual speech recognition by lip-subword correlation based visual pre-training and cross-modal fusion encoder
Y Dai, H Chen, J Du, X Ding, N Ding, F Jiang, CH Lee
2023 IEEE International Conference on Multimedia and Expo (ICME), 2627-2632, 2023
Cited by: 6 | Year: 2023
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
Y Dai, H Chen, J Du, R Wang, S Chen, H Wang, CH Lee
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 5 | Year: 2024
Deep learning based audio-visual multi-speaker DOA estimation using permutation-free loss function
Q Wang, H Chen, Y Jiang, Z Wang, Y Wang, J Du, CH Lee
2022 13th International Symposium on Chinese Spoken Language Processing …, 2022
Cited by: 5 | Year: 2022
Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments
H Zhou, J Du, H Chen, Z Jing, S Xiong, CH Lee
Interspeech, 341-345, 2021
Cited by: 5 | Year: 2021
Semi-supervised multi-channel speaker diarization with cross-channel attention
S Wu, J Du, MK He, S Niu, H Chen, H Tang, CH Lee
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by: 4 | Year: 2023
Hierarchical audio-visual information fusion with multi-label joint decoding for MER 2023
H Wang, Y Xi, H Chen, J Du, Y Song, Q Wang, H Zhou, C Wang, J Ma, ...
Proceedings of the 31st ACM International Conference on Multimedia, 9531-9535, 2023
Cited by: 3 | Year: 2023
Incorporating visual information reconstruction into progressive learning for optimizing audio-visual speech enhancement
CY Zhang, H Chen, J Du, BC Yin, J Pan, CH Lee
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3 | Year: 2023
Grammar-supervised end-to-end speech recognition with part-of-speech tagging and dependency parsing
G Wan, T Mao, J Zhang, H Chen, J Gao, Z Ye
Applied Sciences 13 (7), 4243, 2023
Cited by: 3 | Year: 2023
Optimizing Audio-Visual Speech Enhancement Using Multi-Level Distortion Measures for Audio-Visual Speech Recognition
H Chen, Q Wang, J Du, BC Yin, J Pan, CH Lee
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 2 | Year: 2024
Incorporating lip features into audio-visual multi-speaker DOA estimation by gated fusion
Y Jiang, H Chen, J Du, Q Wang, CH Lee
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 2 | Year: 2023
Collaborative viseme subword and end-to-end modeling for word-level lip reading
H Chen, Q Wang, J Du, GS Wan, SF Xiong, BC Yin, J Pan, CH Lee
IEEE Transactions on Multimedia, 2024
Cited by: 1 | Year: 2024
Summary on the multimodal information based speech processing (MISP) 2022 challenge
H Chen, S Wu, Y Dai, Z Wang, J Du, CH Lee, J Chen, S Watanabe, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 1 | Year: 2023
Articles 1–20