A noise-robust self-supervised pre-training model based speech representation learning for automatic speech recognition QS Zhu, J Zhang, ZQ Zhang, MH Wu, X Fang, LR Dai ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 51 | 2022 |
A Joint Speech Enhancement and Self-Supervised Representation Learning Framework for Noise-Robust Speech Recognition QS Zhu, J Zhang, ZQ Zhang, LR Dai IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | 48* | 2023 |
VatLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning Q Zhu, L Zhou, Z Zhang, S Liu, B Jiao, J Zhang, L Dai, D Jiang, J Li, F Wei IEEE Transactions on Multimedia, 2023 | 38 | 2023 |
Robust data2vec: Noise-robust speech representation learning for ASR by combining regression and improved contrastive learning QS Zhu, L Zhou, J Zhang, SJ Liu, YC Hu, LR Dai ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 34 | 2023 |
Gradient remedy for multi-task learning in end-to-end noise-robust speech recognition Y Hu, C Chen, R Li, Q Zhu, ES Chng ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 27 | 2023 |
Supervised and self-supervised pretraining based COVID-19 detection using acoustic breathing/cough/speech signals XY Chen*, QS Zhu*, J Zhang, LR Dai (*: equal contribution) ICASSP 2022-2022 IEEE International Conference on …, 2022 | 19 | 2022 |
BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions J Zhang, QT Xu, QS Zhu, ZH Ling Interspeech 2023, 2023 | 18 | 2023 |
Wav2code: Restore clean speech representations via codebook lookup for noise-robust ASR Y Hu, C Chen, Q Zhu, ES Chng IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | 13 | 2023 |
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition Y Hu, R Li, C Chen, H Zou, Q Zhu, ES Chng IJCAI 2023, 2023 | 12 | 2023 |
Noise-aware Speech Enhancement using Diffusion Probabilistic Model Y Hu, C Chen, R Li, Q Zhu, ES Chng arXiv preprint arXiv:2307.08029, 2023 | 11 | 2023 |
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition Y Hu, R Li, C Chen, C Qin, Q Zhu, ES Chng ACL 2023, 2023 | 10 | 2023 |
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation Q Zhu, J Zhang, Y Gu, Y Hu, L Dai Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 19768 …, 2024 | 8 | 2024 |
Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models Y Hu, C Chen, C Qin, Q Zhu, ES Chng, R Li arXiv preprint arXiv:2405.10025, 2024 | 6 | 2024 |
Rep2wav: Noise robust text-to-speech using self-supervised representations Q Zhu, Y Gu, R Chen, C Weng, Y Hu, L Dai, J Zhang arXiv preprint arXiv:2308.14553, 2023 | 5 | 2023 |
Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization XY Zhao, QS Zhu, J Zhang 2022 Asia-Pacific Signal and Information Processing Association Annual …, 2022 | 4 | 2022 |
An Improved Wav2Vec 2.0 Pre-Training Approach Using Enhanced Local Dependency Modeling for Speech Recognition. Q Zhu, J Zhang, M Wu, X Fang, LR Dai Interspeech, 4334-4338, 2021 | 4 | 2021 |
Eeg2vec: Self-Supervised Electroencephalographic Representation Learning Q Zhu, X Zhao, J Zhang, Y Gu, C Weng, Y Hu arXiv preprint arXiv:2305.13957, 2023 | 2 | 2023 |
An Experimental Comparison of Noise-Robust Text-To-Speech Synthesis Systems Based On Self-Supervised Representation X Zhao, Q Zhu, Y Hu ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 1 | 2024 |
A Complementary Joint Training Approach Using Unpaired Speech and Text Y Du, J Zhang, Q Zhu, L Dai, MH Wu, X Fang, ZW Yang Proc. Interspeech 2022, 2613-2617, 2022 | 1 | 2022 |
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis Y Gu, Q Zhu, G Lei, C Weng, D Su ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |