Transformer-based online CTC/attention end-to-end speech recognition architecture H Miao, G Cheng, C Gao, P Zhang, Y Yan ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 146 | 2020 |
DPT-FSNet: Dual-path transformer based full-band and sub-band fusion network for speech enhancement F Dang, H Chen, P Zhang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 121 | 2022 |
Using neural network front-ends on far field multiple microphones based speech recognition Y Liu, P Zhang, T Hain 2014 IEEE international conference on acoustics, speech and signal …, 2014 | 110 | 2014 |
The effect of silence and dual-band fusion in anti-spoofing system Y Zhang, W Wang, P Zhang Proc. Interspeech, 4279-4283, 2021 | 98 | 2021 |
Integrating the data augmentation scheme with various classifiers for acoustic scene modeling H Chen, Z Liu, Z Liu, P Zhang, Y Yan arXiv preprint arXiv:1907.06639, 2019 | 89 | 2019 |
Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition. H Miao, G Cheng, P Zhang, T Li, Y Yan Interspeech, 2623-2627, 2019 | 60 | 2019 |
Online hybrid CTC/attention end-to-end automatic speech recognition architecture H Miao, G Cheng, P Zhang, Y Yan IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 1452-1465, 2020 | 59 | 2020 |
Open source MagicData-RAMC: A rich annotated Mandarin conversational (RAMC) speech dataset Z Yang, Y Chen, L Luo, R Yang, L Ye, G Cheng, J Xu, Y Jin, Q Zhang, ... arXiv preprint arXiv:2203.16844, 2022 | 43 | 2022 |
Semi-supervised DNN training in meeting recognition P Zhang, Y Liu, T Hain 2014 IEEE Spoken Language Technology Workshop (SLT), 141-146, 2014 | 39 | 2014 |
Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models K Deng, Z Yang, S Watanabe, Y Higuchi, G Cheng, P Zhang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 33 | 2022 |
PCF: ECAPA-TDNN with progressive channel fusion for speaker verification Z Zhao, Z Li, W Wang, P Zhang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 32 | 2023 |
Improving CTC-based speech recognition via knowledge transferring from pre-trained language models K Deng, S Cao, Y Zhang, L Ma, G Cheng, J Xu, P Zhang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 32 | 2022 |
Self-attention based prosodic boundary prediction for Chinese speech synthesis C Lu, P Zhang, Y Yan ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 32 | 2019 |
Attention-Based LSTM with Multi-Task Learning for Distant Speech Recognition. Y Zhang, P Zhang, Y Yan Interspeech, 3857-3861, 2017 | 32 | 2017 |
Deep convolutional neural network with scalogram for audio scene modeling H Chen, P Zhang, H Bai, Q Yuan, X Bao, Y Yan Proc. Interspeech 2018, 3304-3308, 2018 | 30 | 2018 |
Emilia: An extensive, multilingual, and diverse speech dataset for large-scale speech generation H He, Z Shang, C Wang, X Li, Y Gu, H Hua, L Liu, C Yang, J Li, P Shi, ... 2024 IEEE Spoken Language Technology Workshop (SLT), 885-890, 2024 | 26 | 2024 |
Pre-training transformer decoder for end-to-end ASR model with unpaired text data C Gao, G Cheng, R Yang, H Zhu, P Zhang, Y Yan ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 24 | 2021 |
Beam-guided TasNet: An iterative speech separation framework with multi-channel output H Chen, Y Yi, D Feng, P Zhang arXiv preprint arXiv:2102.02998, 2021 | 22 | 2021 |
Multi-accent adaptation based on gate mechanism H Zhu, L Wang, P Zhang, Y Yan arXiv preprint arXiv:2011.02774, 2020 | 22 | 2020 |
Incorporating Cross-Speaker Style Transfer for Multi-Language Text-to-Speech. Z Shang, Z Huang, H Zhang, P Zhang, Y Yan Interspeech, 1619-1623, 2021 | 20 | 2021 |