Zhendong Peng
Tsinghua University
Verified email at tsinghua.org.cn
Title
Cited by
Year
Wenet: Production oriented streaming and non-streaming end-to-end speech recognition toolkit
Z Yao, D Wu, X Wang, B Zhang, F Yu, C Yang, Z Peng, X Chen, L Xie, ...
arXiv preprint arXiv:2102.01547, 2021
Cited by: 288; Year: 2021
Wenetspeech: A 10000+ hours multi-domain mandarin corpus for speech recognition
B Zhang, H Lv, P Guo, Q Shao, C Yang, L Xie, X Xu, H Bu, X Chen, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 223; Year: 2022
Wespeaker: A research and production oriented speaker embedding learning toolkit
H Wang, C Liang, S Wang, Z Chen, B Zhang, X Xiang, Y Deng, Y Qian
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 119; Year: 2023
Wenet 2.0: More productive end-to-end speech recognition toolkit
B Zhang, D Wu, Z Peng, X Song, Z Yao, H Lv, L Xie, C Yang, F Pan, J Niu
arXiv preprint arXiv:2203.15455, 2022
Cited by: 105; Year: 2022
U2++: Unified two-pass bidirectional end-to-end model for speech recognition
D Wu, B Zhang, C Yang, Z Peng, W Xia, X Chen, X Lei
arXiv preprint arXiv:2106.05642, 2021
Cited by: 55; Year: 2021
Zeroprompt: Streaming acoustic encoders are zero-shot masked lms
X Song, D Wu, B Zhang, Z Peng, B Dang, F Pan, Z Wu
arXiv preprint arXiv:2305.10649, 2023
Cited by: 29; Year: 2023
Wenet: Production first and production ready end-to-end speech recognition toolkit
B Zhang, D Wu, C Yang, X Chen, Z Peng, X Wang, Z Yao, X Wang, F Yu, ...
arXiv e-prints, arXiv:2102.01547, 2021
Cited by: 26; Year: 2021
ABFL: an autoencoder based practical approach for software fault localization
Z Peng, X Xiao, G Hu, AK Sangaiah, M Atiquzzaman, S Xia
Information sciences 510, 108-121, 2020
Cited by: 26; Year: 2020
Lightgrad: Lightweight diffusion probabilistic model for text-to-speech
J Chen, X Song, Z Peng, B Zhang, F Pan, Z Wu
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 15; Year: 2023
Branch-ECAPA-TDNN: A parallel branch architecture to capture local and global features for speaker verification
J Yao, C Liang, Z Peng, B Zhang, XL Zhang
Proc. Interspeech, 1943-1947, 2023
Cited by: 15; Year: 2023
Fast-u2++: Fast and accurate end-to-end speech recognition in joint ctc/attention frames
C Liang, XL Zhang, BB Zhang, D Wu, S Li, X Song, Z Peng, F Pan
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 11; Year: 2023
Trimtail: Low-latency streaming asr with simple but effective spectrogram-level length penalty
X Song, D Wu, Z Wu, B Zhang, Y Zhang, Z Peng, W Li, F Pan, C Zhu
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 9; Year: 2023
U2++ moe: Scaling 4.7 x parameters with minimal impact on rtf
X Song, D Wu, B Zhang, D Zhou, Z Peng, B Dang, F Pan, C Yang
arXiv preprint arXiv:2404.16407, 2024
Cited by: 6; Year: 2024
TouchASP: Elastic Automatic Speech Perception that Everyone Can Touch
X Song, C Liang, B Zhang, P Zhang, ZY Wang, Y Ma, M Xu, L Wang, ...
arXiv preprint arXiv:2412.15622, 2024
Cited by: 2; Year: 2024
TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch
X Song, M Xing, C Ma, S Li, D Wu, B Zhang, F Pan, D Zhou, Y Zhang, ...
arXiv preprint arXiv:2412.08237, 2024
Cited by: 2; Year: 2024
Fusionformer: Fusing operations in transformer for efficient streaming speech recognition
X Song, D Wu, B Zhang, Z Wu, W Li, D Li, P Zhang, Z Peng, F Pan, C Zhu, ...
arXiv preprint arXiv:2210.17079, 2022
Cited by: 2; Year: 2022
Non-local self-attention structure for function approximation in deep reinforcement learning
Z Wang, X Xiao, G Hu, Y Yao, D Zhang, Z Peng, Q Li, S Xia
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 2; Year: 2019
Hydraformer: One Encoder for All Subsampling Rates
Y Xu, X Song, Z Wu, D Wu, Z Peng, B Zhang
2024 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2024
Year: 2024
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
B Zhang, Z Peng, B Dang, F Pan, Z Wu