| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Paraformer: Fast and accurate parallel transformer for non-autoregressive end-to-end speech recognition | Z Gao, S Zhang, I McLoughlin, Z Yan | arXiv preprint arXiv:2206.08317 | 102 | 2022 |
| Improving Aggregation and Loss Function for Better Embedding Learning in End-to-End Speaker Verification System | Z Gao, Y Song, IV McLoughlin, P Li, Y Jiang, LR Dai | INTERSPEECH 2019, 361-365 | 88 | 2019 |
| emotion2vec: Self-supervised pre-training for speech emotion representation | Z Ma, Z Zheng, J Ye, J Li, Z Gao, S Zhang, X Chen | arXiv preprint arXiv:2312.15185 | 82 | 2023 |
| LauraGPT: Listen, attend, understand, and regenerate audio with GPT | Z Du, J Wang, Q Chen, Y Chu, Z Gao, Z Li, K Hu, X Zhou, J Xu, Z Ma, ... | arXiv preprint arXiv:2310.04673 | 68 | 2023 |
| CosyVoice: A scalable multilingual zero-shot text-to-speech synthesizer based on supervised semantic tokens | Z Du, Q Chen, S Zhang, K Hu, H Lu, Y Yang, H Hu, S Zheng, Y Gu, Z Ma, ... | arXiv preprint arXiv:2407.05407 | 63 | 2024 |
| FunASR: A Fundamental End-to-End Speech Recognition Toolkit | Z Gao, Z Li, J Wang, H Luo, X Shi, M Chen, Y Li, L Zuo, Z Du, Z Xiao, ... | INTERSPEECH 2023 | 53 | 2023 |
| SAN-M: Memory equipped self-attention for end-to-end speech recognition | Z Gao, S Zhang, M Lei, I McLoughlin | INTERSPEECH 2020, 6-10 | 36 | 2020 |
| An Effective Deep Embedding Learning Architecture for Speaker Verification | Y Jiang, Y Song, IV McLoughlin, Z Gao, LR Dai | INTERSPEECH 2019, 4040-4044 | 36 | 2019 |
| Streaming chunk-aware multihead attention for online end-to-end speech recognition | S Zhang, Z Gao, H Luo, M Lei, J Gao, Z Yan, L Xie | INTERSPEECH 2020, 2142-2146 | 32 | 2020 |
| An improved deep embedding learning method for short duration speaker verification | Z Gao, Y Song, IV McLoughlin, W Guo, LR Dai | INTERSPEECH 2018, 3578-3582 | 32 | 2018 |
| An embarrassingly simple approach for LLM with strong ASR capacity | Z Ma, G Yang, Y Yang, Z Gao, J Wang, Z Du, F Yu, Q Chen, S Zheng, ... | arXiv preprint arXiv:2402.08846 | 26 | 2024 |
| FunAudioLLM: Voice understanding and generation foundation models for natural interaction between humans and LLMs | K An, Q Chen, C Deng, Z Du, C Gao, Z Gao, Y Gu, T He, H Hu, K Hu, S Ji, ... | arXiv preprint arXiv:2407.04051 | 25 | 2024 |
| Extremely Low Footprint End-to-End ASR System for Smart Device | Z Gao, Y Yao, S Zhang, J Yang, M Lei, I McLoughlin | INTERSPEECH 2021, 4548-4552 | 16 | 2021 |
| Universal ASR: Unifying streaming and non-streaming ASR using a single encoder-decoder model | Z Gao, S Zhang, M Lei, I McLoughlin | arXiv preprint arXiv:2010.14099 | 16 | 2020 |
| SeACo-Paraformer: A non-autoregressive ASR system with flexible and effective hotword customization ability | X Shi, Y Yang, Z Li, Y Chen, Z Gao, S Zhang | ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and … | 13 | 2024 |
| MaLa-ASR: Multimedia-assisted LLM-based ASR | G Yang, Z Ma, F Yu, Z Gao, S Zhang, X Chen | arXiv preprint arXiv:2406.05839 | 8 | 2024 |
| CosyVoice 2: Scalable streaming speech synthesis with large language models | Z Du, Y Wang, Q Chen, X Shi, X Lv, T Zhao, Z Gao, Y Yang, C Gao, ... | arXiv preprint arXiv:2412.10117 | 6 | 2024 |
| MinMo: A multimodal large language model for seamless voice interaction | Q Chen, Y Chen, Y Chen, M Chen, Y Chen, C Deng, Z Du, R Gao, C Gao, ... | arXiv preprint arXiv:2501.06282 | 2 | 2025 |
| Wav2vec-MoE: An unsupervised pre-training and adaptation method for multi-accent ASR | Y Lin, S Zhang, Z Gao, L Wang, Y Yang, J Dang | Electronics Letters 59 (11), e12823 | 2 | 2023 |
| CTC-Assisted LLM-Based Contextual ASR | G Yang, Z Ma, Z Gao, S Zhang, X Chen | 2024 IEEE Spoken Language Technology Workshop (SLT), 126-131 | 1 | 2024 |