Title | Authors | Venue | Cited by | Year |
Semi-orthogonal low-rank matrix factorization for deep neural networks | D Povey, G Cheng, Y Wang, K Li, H Xu, M Yarmohammadi, S Khudanpur | Interspeech, 3743-3747, 2018 | 655 | 2018 |
An Exploration of Dropout with LSTMs | G Cheng, V Peddinti, D Povey, V Manohar, S Khudanpur, Y Yan | Interspeech, 1586-1590, 2017 | 157 | 2017 |
Transformer-based online CTC/attention end-to-end speech recognition architecture | H Miao, G Cheng, C Gao, P Zhang, Y Yan | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 146 | 2020 |
Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition | H Miao, G Cheng, P Zhang, T Li, Y Yan | Proc. Interspeech 2019, 2623-2627, 2019 | 60 | 2019 |
Online hybrid CTC/attention end-to-end automatic speech recognition architecture | H Miao, G Cheng, P Zhang, Y Yan | IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 1452-1465, 2020 | 59 | 2020 |
Open source MagicData-RAMC: A rich annotated Mandarin conversational (RAMC) speech dataset | Z Yang, Y Chen, L Luo, R Yang, L Ye, G Cheng, J Xu, Y Jin, Q Zhang, ... | arXiv preprint arXiv:2203.16844, 2022 | 43 | 2022 |
Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models | K Deng, Z Yang, S Watanabe, Y Higuchi, G Cheng, P Zhang | ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 33 | 2022 |
Improving CTC-based speech recognition via knowledge transferring from pre-trained language models | K Deng, S Cao, Y Zhang, L Ma, G Cheng, J Xu, P Zhang | ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 32 | 2022 |
Output-Gate Projected Gated Recurrent Unit for Speech Recognition | G Cheng, D Povey, L Huang, J Xu, S Khudanpur, Y Yan | Proc. Interspeech 2018, 1793-1797, 2018 | 31 | 2018 |
Pre-training transformer decoder for end-to-end ASR model with unpaired text data | C Gao, G Cheng, R Yang, H Zhu, P Zhang, Y Yan | ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 24 | 2021 |
Alleviating ASR long-tailed problem by decoupling the learning of representation and classification | K Deng, G Cheng, R Yang, Y Yan | IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 340-354, 2021 | 23 | 2021 |
ETEH: Unified attention-based end-to-end ASR and KWS architecture | G Cheng, H Miao, R Yang, K Deng, Y Yan | IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1360-1373, 2022 | 21 | 2022 |
Keyword search using attention-based end-to-end ASR and frame-synchronous phoneme alignments | R Yang, G Cheng, H Miao, T Li, P Zhang, Y Yan | IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3202-3215, 2021 | 18 | 2021 |
Alternative pseudo-labeling for semi-supervised automatic speech recognition | H Zhu, D Gao, G Cheng, D Povey, P Zhang, Y Yan | IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 3320-3330, 2023 | 14 | 2023 |
Decoupled federated learning for ASR with non-IID data | H Zhu, J Wang, G Cheng, P Zhang, Y Yan | arXiv preprint arXiv:2206.09102, 2022 | 13 | 2022 |
Boosting cross-domain speech recognition with self-supervision | H Zhu, G Cheng, J Wang, W Hou, P Zhang, Y Yan | IEEE/ACM Transactions on Audio, Speech, and Language Processing 32, 471-485, 2023 | 12 | 2023 |
The conversational short-phrase speaker diarization (CSSD) task: Dataset, evaluation metric and baselines | G Cheng, Y Chen, R Yang, Q Li, Z Yang, L Ye, P Zhang, Q Zhang, L Xie, ... | 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 10 | 2022 |
Self-supervised pre-training for attention-based encoder-decoder ASR model | C Gao, G Cheng, T Li, P Zhang, Y Yan | IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1763-1774, 2022 | 10 | 2022 |
Wav2vec-S: Semi-supervised pre-training for low-resource ASR | H Zhu, L Wang, J Wang, G Cheng, P Zhang, Y Yan | arXiv preprint arXiv:2110.04484, 2021 | 10 | 2021 |
Using highway connections to enable deep small-footprint LSTM-RNNs for speech recognition | G Cheng, X Li, Y Yan | Chinese Journal of Electronics 28 (1), 107-112, 2019 | 10 | 2019 |