An Embarrassingly Simple Approach for LLM with Strong ASR Capacity | Z Ma, G Yang, Y Yang, Z Gao, J Wang, Z Du, F Yu, Q Chen, S Zheng, ... | arXiv preprint arXiv:2402.08846, 2024 | 42 | 2024 |
Pushing the limits of unsupervised unit discovery for SSL speech representation | Z Ma, Z Zheng, G Yang, Y Wang, C Zhang, X Chen | Interspeech 2023, 2023 | 9 | 2023 |
MaLa-ASR: Multimedia-Assisted LLM-Based ASR | G Yang, Z Ma, F Yu, Z Gao, S Zhang, X Chen | Interspeech 2024, 2024 | 8 | 2024 |
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning | G Yang, Z Ma, Z Zheng, Y Song, Z Niu, X Chen | 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023 | 8 | 2023 |
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers | Y Song, Z Chen, X Wang, Z Ma, G Yang, X Chen | Interspeech 2024, 2024 | 3 | 2024 |
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction | Q Chen, Y Chen, Y Chen, M Chen, Y Chen, C Deng, Z Du, R Gao, C Gao, ... | arXiv preprint arXiv:2501.06282, 2025 | 2 | 2025 |
CTC-Assisted LLM-Based Contextual ASR | G Yang, Z Ma, Z Gao, S Zhang, X Chen | SLT 2024, 2024 | 2 | 2024 |
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap | G Yang, F Yu, Z Ma, Z Du, Z Gao, S Zhang, X Chen | ICASSP 2025, 2024 | | 2024 |