Zipformer: A faster and better encoder for automatic speech recognition Z Yao, L Guo, X Yang, W Kang, F Kuang, Y Yang, Z Jin, L Lin, D Povey arXiv preprint arXiv:2310.11230, 2023 | 90 | 2023 |
Pruned RNN-T for fast, memory-efficient ASR training F Kuang, L Guo, W Kang, L Lin, M Luo, Z Yao, D Povey arXiv preprint arXiv:2206.13236, 2022 | 71 | 2022 |
Libriheavy: A 50,000 hours ASR corpus with punctuation casing and context W Kang, X Yang, Z Yao, F Kuang, Y Yang, L Guo, L Lin, D Povey ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 34 | 2024 |
Fast and parallel decoding for transducer W Kang, L Guo, F Kuang, L Lin, M Luo, Z Yao, X Yang, P Żelasko, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 12 | 2023 |
PromptASR for contextualized ASR with controllable style X Yang, W Kang, Z Yao, Y Yang, L Guo, F Kuang, L Lin, D Povey ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 9 | 2024 |
Blank-regularized CTC for frame skipping in neural transducer Y Yang, X Yang, L Guo, Z Yao, W Kang, F Kuang, L Lin, X Chen, D Povey arXiv preprint arXiv:2305.11558, 2023 | 9 | 2023 |
Predicting multi-codebook vector quantization indexes for knowledge distillation L Guo, X Yang, Q Wang, Y Kong, Z Yao, F Cui, F Kuang, W Kang, L Lin, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 6 | 2023 |
Delay-penalized transducer for low-latency streaming ASR W Kang, Z Yao, F Kuang, L Guo, X Yang, L Lin, P Żelasko, D Povey ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 3 | 2023 |
Delay-penalized CTC implemented based on Finite State Transducer Z Yao, W Kang, F Kuang, L Guo, X Yang, Y Yang, L Lin, D Povey arXiv preprint arXiv:2305.11539, 2023 | 3 | 2023 |
LibriheavyMix: a 20,000-hour dataset for single-channel reverberant multi-talker speech separation, ASR and speaker diarization Z Jin, Y Yang, M Shi, W Kang, X Yang, Z Yao, F Kuang, L Guo, L Meng, ... arXiv preprint arXiv:2409.00819, 2024 | 2 | 2024 |
CR-CTC: Consistency regularization on CTC for improved speech recognition Z Yao, W Kang, X Yang, F Kuang, L Guo, H Zhu, Z Jin, Z Li, L Lin, D Povey arXiv preprint arXiv:2410.05101, 2024 | 1 | 2024 |
Method and apparatus for training neural network, and method and apparatus for audio processing W Kang, P Daniel, F Kuang, GUO Liyong, YAO Zengwei, L Lin, ... US Patent App. 18/080,713, 2023 | 1 | 2023 |
k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning Y Yang, J Zhuo, Z Jin, Z Ma, X Yang, Z Yao, L Guo, W Kang, F Kuang, ... arXiv preprint arXiv:2411.17100, 2024 | | 2024 |
Method and apparatus for audio processing, electronic device and storage medium LUO Mingshuang, F Kuang, GUO Liyong, L Lin, W Kang, YAO Zengwei, ... US Patent App. 18/078,483, 2023 | | 2023 |
Method of training speech recognition model, electronic device and storage medium YAO Zengwei, GUO Liyong, P Daniel, L Lin, F Kuang, W Kang, ... US Patent App. 18/078,460, 2023 | | 2023 |