Zengwei Yao
Machine Learning Engineer, Xiaomi Corp.
Verified email at xiaomi.com
Title
Cited by
Year
Speech emotion recognition using fusion of three multi-task learning-based classifiers: HSF-DNN, MS-CNN and LLD-RNN
Z Yao, Z Wang, W Liu, Y Liu, J Pan
Speech Communication 120, 11-19, 2020
Cited by: 157; Year: 2020
Convolutional two-stream network using multi-facial feature fusion for driver fatigue detection
W Liu, J Qian, Z Yao, X Jiao, J Pan
Future Internet 11 (5), 115, 2019
Cited by: 111; Year: 2019
Zipformer: A faster and better encoder for automatic speech recognition
Z Yao, L Guo, X Yang, W Kang, F Kuang, Y Yang, Z Jin, L Lin, D Povey
ICLR 2024 (Oral), 2023
Cited by: 83; Year: 2023
Pruned RNN-T for fast, memory-efficient ASR training
F Kuang, L Guo, W Kang, L Lin, M Luo, Z Yao, D Povey
Proc. INTERSPEECH 2022, 2068-2072, 2022
Cited by: 70; Year: 2022
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
W Kang, X Yang, Z Yao, F Kuang, Y Yang, L Guo, L Lin, D Povey
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 35; Year: 2024
Fingerprint restoration using cubic Bezier curve
Y Tu, Z Yao, J Xu, Y Liu, Z Zhang
BMC Bioinformatics 21, 1-19, 2020
Cited by: 19; Year: 2020
Fast and parallel decoding for transducer
W Kang, L Guo, F Kuang, L Lin, M Luo, Z Yao, X Yang, P Żelasko, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 12; Year: 2023
Stepwise-refining speech separation network via fine-grained encoding in high-order latent domain
Z Yao, W Pei, F Chen, G Lu, D Zhang
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 378-393, 2022
Cited by: 12; Year: 2022
PromptASR for contextualized ASR with controllable style
X Yang, W Kang, Z Yao, Y Yang, L Guo, F Kuang, L Lin, D Povey
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 9; Year: 2024
Blank-regularized CTC for Frame Skipping in Neural Transducer
Y Yang, X Yang, L Guo, Z Yao, W Kang, F Kuang, L Lin, X Chen, D Povey
Proc. INTERSPEECH 2023, 4409-4413, 2023
Cited by: 9; Year: 2023
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
L Guo, X Yang, Q Wang, Y Kong, Z Yao, F Cui, F Kuang, W Kang, L Lin, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 6; Year: 2023
Delay-penalized transducer for low-latency streaming ASR
W Kang, Z Yao, F Kuang, L Guo, X Yang, L Lin, P Żelasko, D Povey
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3; Year: 2023
Delay-penalized CTC implemented based on Finite State Transducer
Z Yao, W Kang, F Kuang, L Guo, X Yang, Y Yang, L Lin, D Povey
Proc. INTERSPEECH 2023, 1329-1333, 2023
Cited by: 3; Year: 2023
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Z Jin, Y Yang, M Shi, W Kang, X Yang, Z Yao, F Kuang, L Guo, L Meng, ...
Proc. INTERSPEECH 2024, 702-706, 2024
Cited by: 2; Year: 2024
METHOD AND APPARATUS FOR TRAINING NEURAL NETWORK, AND METHOD AND APPARATUS FOR AUDIO PROCESSING
W Kang, P Daniel, F Kuang, L Guo, Z Yao, L Lin, M Luo
US Patent App. 18/080,713, 2023
Cited by: 1; Year: 2023
k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning
Y Yang, J Zhuo, Z Jin, Z Ma, X Yang, Z Yao, L Guo, W Kang, F Kuang, ...
arXiv preprint arXiv:2411.17100, 2024
2024
CR-CTC: Consistency regularization on CTC for improved speech recognition
Z Yao, W Kang, X Yang, F Kuang, L Guo, H Zhu, Z Jin, Z Li, L Lin, D Povey
ICLR 2025, 2024
2024
METHOD AND APPARATUS FOR AUDIO PROCESSING, ELECTRONIC DEVICE AND STORAGE MEDIUM
M Luo, F Kuang, L Guo, L Lin, W Kang, Z Yao, P Daniel
US Patent App. 18/078,483, 2023
2023
METHOD OF TRAINING SPEECH RECOGNITION MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM
Z Yao, L Guo, P Daniel, L Lin, F Kuang, W Kang, M Luo, Q Wang, Y Kong
US Patent App. 18/078,460, 2023
2023
Semantic-Aware Local-Global Vision Transformer
J Zhang, Z Yao, F Chen, G Lu, W Pei
arXiv preprint arXiv:2211.14705, 2022
2022