Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers C Wang, S Chen, Y Wu, Z Zhang, L Zhou, S Liu, Z Chen, Y Liu, H Wang, ... arXiv preprint arXiv:2301.02111, 2023 | 620 | 2023 |
Speak foreign languages with your own voice: Cross-lingual neural codec language modeling Z Zhang, L Zhou, C Wang, S Chen, Y Wu, S Liu, Z Chen, Y Liu, H Wang, ... arXiv preprint arXiv:2303.03926, 2023 | 157 | 2023 |
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training Z Zhang, L Zhou, J Ao, S Liu, L Dai, J Li, F Wei arXiv preprint arXiv:2210.03730, 2022 | 57 | 2022 |
SpeechLM: Enhanced speech pre-training with unpaired textual data Z Zhang, S Chen, L Zhou, Y Wu, S Ren, S Liu, Z Yao, X Gong, L Dai, J Li, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 52 | 2024 |
A noise-robust self-supervised pre-training model based speech representation learning for automatic speech recognition QS Zhu, J Zhang, ZQ Zhang, MH Wu, X Fang, LR Dai ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 51 | 2022 |
VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning Q Zhu, L Zhou, Z Zhang, S Liu, B Jiao, J Zhang, L Dai, D Jiang, J Li, F Wei arXiv preprint arXiv:2211.11275, 2022 | 36 | 2022 |
Cross-Lingual Self-training to Learn Multilingual Representation for Low-Resource Speech Recognition ZQ Zhang, Y Song, MH Wu, X Fang, I McLoughlin, LR Dai Circuits, Systems, and Signal Processing, 1-17, 2022 | 32 | 2022 |
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data J Ao, Z Zhang, L Zhou, S Liu, H Li, T Ko, L Dai, J Li, Y Qian, F Wei arXiv preprint arXiv:2203.17113, 2022 | 19 | 2022 |
Semi-Supervised End-to-End ASR via Teacher-Student Learning with Conditional Posterior Distribution Z Zhang, Y Song, J Zhang, I McLoughlin, L Dai INTERSPEECH, 3580-3584, 2020 | 12 | 2020 |
The YiTrans Speech Translation System for IWSLT 2022 Offline Shared Task Z Zhang, J Ao Proceedings of the 19th International Conference on Spoken Language …, 2022 | 7 | 2022 |
XLST: Cross-lingual self-training to learn multilingual representation for low resource speech recognition ZQ Zhang, Y Song, MH Wu, X Fang, LR Dai arXiv preprint arXiv:2103.08207, 2021 | | 2021 |