Xiong Wang
Tencent, Speech Algorithm Engineer
Verified email at tencent.com
Title
Cited by
Year
Wenet: Production oriented streaming and non-streaming end-to-end speech recognition toolkit
Z Yao, D Wu, X Wang, B Zhang, F Yu, C Yang, Z Peng, X Chen, L Xie, ...
arXiv preprint arXiv:2102.01547, 2021
Cited by 290, 2021
Unified streaming and non-streaming two-pass end-to-end model for speech recognition
B Zhang, D Wu, Z Yao, X Wang, F Yu, C Yang, L Guo, Y Hu, L Xie, X Lei
arXiv preprint arXiv:2012.05481, 2020
Cited by 82, 2020
Adversarial examples for improving end-to-end attention-based small-footprint keyword spotting
X Wang, S Sun, C Shan, J Hou, L Xie, S Li, X Lei
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by 46, 2019
Vita: Towards open-source interactive omni multimodal llm
C Fu, H Lin, Z Long, Y Shen, M Zhao, Y Zhang, S Dong, X Wang, D Yin, ...
arXiv preprint arXiv:2408.05211, 2024
Cited by 43, 2024
Accent and speaker disentanglement in many-to-many voice conversion
Z Wang, W Ge, X Wang, S Yang, W Gan, H Chen, H Li, L Xie, X Li
2021 12th International Symposium on Chinese Spoken Language Processing …, 2021
Cited by 37, 2021
Cascade rnn-transducer: Syllable based streaming on-device mandarin speech recognition with a syllable-to-character converter
X Wang, Z Yao, X Shi, L Xie
2021 IEEE Spoken Language Technology Workshop (SLT), 15-21, 2021
Cited by 32, 2021
Efficient conformer with prob-sparse attention mechanism for end-to-end speech recognition
X Wang, S Sun, L Xie, L Ma
arXiv preprint arXiv:2106.09236, 2021
Cited by 23, 2021
The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines
F Yu, Z Yao, X Wang, K An, L Xie, Z Ou, B Liu, X Li, G Miao
2021 IEEE Spoken Language Technology Workshop (SLT), 1117-1123, 2021
Cited by 21, 2021
Virtual adversarial training for DS-CNN based small-footprint keyword spotting
X Wang, S Sun, L Xie
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019
Cited by 12, 2019
Freeze-omni: A smart and low latency speech-to-speech dialogue model with frozen llm
X Wang, Y Li, C Fu, Y Shen, L Xie, K Li, X Sun, L Ma
arXiv preprint arXiv:2411.00774, 2024
Cited by 11, 2024
Two stage contextual word filtering for context bias in unified streaming and non-streaming transducer
Z Yang, S Sun, X Wang, Y Zhang, L Ma, L Xie
arXiv preprint arXiv:2301.06735, 2023
Cited by 10, 2023
CaTT-KWS: a multi-stage customized keyword spotting framework based on cascaded transducer-transformer
Z Yang, S Sun, J Li, X Zhang, X Wang, L Ma, L Xie
arXiv preprint arXiv:2207.01267, 2022
Cited by 10, 2022
Vita-1.5: Towards gpt-4o level real-time vision and speech interaction
C Fu, H Lin, X Wang, YF Zhang, Y Shen, X Liu, Y Li, Z Long, H Gao, K Li, ...
arXiv preprint arXiv:2501.01957, 2025
Cited by 5, 2025
DCCRN-KWS: an audio bias based model for noise robust small-footprint keyword spotting
S Lv, X Wang, S Sun, L Ma, L Xie
arXiv preprint arXiv:2305.12331, 2023
Cited by 5, 2023
IEEE SLT 2021 Alpha-mini speech challenge: Open datasets, tracks, rules and baselines
Y Fu, Z Yao, W He, J Wu, X Wang, Z Yang, S Zhang, L Xie, D Huang, H Bu, ...
2021 IEEE Spoken Language Technology Workshop (SLT), 1101-1108, 2021
Cited by 5, 2021
Minimizing sequential confusion error in speech command recognition
Z Yang, H Lv, X Wang, A Zhang, L Xie
arXiv preprint arXiv:2207.01261, 2022
Cited by 1, 2022
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
Y Shen, C Fu, S Dong, X Wang, P Chen, M Zhang, H Cao, K Li, X Zheng, ...
arXiv preprint arXiv:2502.05177, 2025
2025
LUCY: Linguistic Understanding and Control Yielding Early Stage of Her
H Gao, H Shao, X Wang, C Qiu, Y Shen, S Cai, Y Shi, Z Xu, Z Long, ...
arXiv preprint arXiv:2501.16327, 2025
2025
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition
Y Li, X Wang, S Cao, Y Zhang, L Ma, L Xie
arXiv preprint arXiv:2408.09491, 2024
2024
Articles 1–19