Zengwei Yao

Cited by

	All	Since 2020
Citations	532	527
h-index	9	9
i10-index	8	8

Citations per year
Year	Citations
2019	5
2020	23
2021	47
2022	76
2023	124
2024	236
2025	21

Public access

3 articles available
0 articles not available
Based on funding mandates

Co-authors

Fangjun Kuang	Xiaomi	Verified email at xiaomi.com
Wei Kang	Senior engineer, Xiaomi Corp.	Verified email at xiaomi.com
Daniel Povey	Chief Speech Scientist, Xiaomi Corp.	Verified email at xiaomi.com
Xiaoyu Yang	University of Cambridge	Verified email at cam.ac.uk
Mingshuang Luo	ICT, UCAS, Peng Cheng Lab	Verified email at mails.ucas.ac.cn
Weihuang Liu	University of Macau	Verified email at um.edu.mo
Jiahui Pan	South China Normal University	Verified email at m.scnu.edu.cn
Zengrui Jin	The Chinese University of Hong Kong	Verified email at se.cuhk.edu.hk
Piotr Żelasko	Principal Research Scientist @ Nvidia	Verified email at nvidia.com
Wenjie Pei	Harbin Institute of Technology, Shenzhen; Delft University of Technology	Verified email at hit.edu.cn
Guangming Lu	Harbin Institute of Technology, Shenzhen	Verified email at hit.edu.cn
Quandong Wang	Senior Speech Engineer, Xiaomi Corporation, Beijing, China	Verified email at xiaomi.com
Yifan Yang	Shanghai Jiao Tong University	Verified email at sjtu.edu.cn
Fanglin Chen	Associate Professor, Harbin Institute of Technology, Shenzhen	Verified email at hit.edu.cn
Zhaoqing Li	The Chinese University of Hong Kong	Verified email at cuhk.edu.hk
Han Zhu	Institute of Acoustics, Chinese Academy of Sciences	Verified email at hccl.ioa.ac.cn
Liyong Guo

Zengwei Yao

Machine Learning Engineer, Xiaomi Corp.

Verified email at xiaomi.com

speech recognition, deep learning


Title	Authors	Venue	Cited by	Year
Speech emotion recognition using fusion of three multi-task learning-based classifiers: HSF-DNN, MS-CNN and LLD-RNN	Z Yao, Z Wang, W Liu, Y Liu, J Pan	Speech Communication 120, 11-19, 2020	157	2020
Convolutional two-stream network using multi-facial feature fusion for driver fatigue detection	W Liu, J Qian, Z Yao, X Jiao, J Pan	Future Internet 11 (5), 115, 2019	111	2019
Zipformer: A faster and better encoder for automatic speech recognition	Z Yao, L Guo, X Yang, W Kang, F Kuang, Y Yang, Z Jin, L Lin, D Povey	ICLR 2024 (Oral), 2023	83	2023
Pruned RNN-T for fast, memory-efficient ASR training	F Kuang, L Guo, W Kang, L Lin, M Luo, Z Yao, D Povey	Proc. INTERSPEECH 2022, 2068-2072, 2022	70	2022
Libriheavy: a 50,000 hours asr corpus with punctuation casing and context	W Kang, X Yang, Z Yao, F Kuang, Y Yang, L Guo, L Lin, D Povey	ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	35	2024
Fingerprint restoration using cubic Bezier curve	Y Tu, Z Yao, J Xu, Y Liu, Z Zhang	BMC bioinformatics 21, 1-19, 2020	19	2020
Fast and parallel decoding for transducer	W Kang, L Guo, F Kuang, L Lin, M Luo, Z Yao, X Yang, P Żelasko, ...	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	12	2023
Stepwise-refining speech separation network via fine-grained encoding in high-order latent domain	Z Yao, W Pei, F Chen, G Lu, D Zhang	IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 378-393, 2022	12	2022
PromptASR for contextualized ASR with controllable style	X Yang, W Kang, Z Yao, Y Yang, L Guo, F Kuang, L Lin, D Povey	ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	9	2024
Blank-regularized CTC for Frame Skipping in Neural Transducer	Y Yang, X Yang, L Guo, Z Yao, W Kang, F Kuang, L Lin, X Chen, D Povey	Proc. INTERSPEECH 2023, 4409-4413, 2023	9	2023
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation	L Guo, X Yang, Q Wang, Y Kong, Z Yao, F Cui, F Kuang, W Kang, L Lin, ...	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	6	2023
Delay-penalized transducer for low-latency streaming ASR	W Kang, Z Yao, F Kuang, L Guo, X Yang, L Lin, P Żelasko, D Povey	ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	3	2023
Delay-penalized CTC implemented based on Finite State Transducer	Z Yao, W Kang, F Kuang, L Guo, X Yang, Y Yang, L Lin, D Povey	Proc. INTERSPEECH 2023, 1329-1333, 2023	3	2023
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization	Z Jin, Y Yang, M Shi, W Kang, X Yang, Z Yao, F Kuang, L Guo, L Meng, ...	Proc. INTERSPEECH 2024, 702-706, 2024	2	2024
METHOD AND APPARATUS FOR TRAINING NEURAL NETWORK, AND METHOD AND APPARATUS FOR AUDIO PROCESSING	W Kang, P Daniel, F Kuang, L Guo, Z Yao, L Lin, M Luo	US Patent App. 18/080,713, 2023	1	2023
k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning	Y Yang, J Zhuo, Z Jin, Z Ma, X Yang, Z Yao, L Guo, W Kang, F Kuang, ...	arXiv preprint arXiv:2411.17100, 2024		2024
CR-CTC: Consistency regularization on CTC for improved speech recognition	Z Yao, W Kang, X Yang, F Kuang, L Guo, H Zhu, Z Jin, Z Li, L Lin, D Povey	ICLR 2025, 2024		2024
METHOD AND APPARATUS FOR AUDIO PROCESSING, ELECTRONIC DEVICE AND STORAGE MEDIUM	M Luo, F Kuang, L Guo, L Lin, W Kang, Z Yao, P Daniel	US Patent App. 18/078,483, 2023		2023
METHOD OF TRAINING SPEECH RECOGNITION MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM	Z Yao, L Guo, P Daniel, L Lin, F Kuang, W Kang, M Luo, Q Wang, Y Kong	US Patent App. 18/078,460, 2023		2023
Semantic-Aware Local-Global Vision Transformer	J Zhang, Z Yao, F Chen, G Lu, W Pei	arXiv preprint arXiv:2211.14705, 2022		2022
