Wei Li
Bytedance
Verified email at bytedance.com
Title
Cited by
Year
SALMONN: Towards generic hearing abilities for large language models
C Tang, W Yu, G Sun, X Chen, T Tan, W Li, L Lu, Z Ma, C Zhang
arXiv preprint arXiv:2310.13289, 2023
Cited by 219, year 2023
Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling
W Li, SM Siniscalchi, NF Chen, CH Lee
2016 IEEE international conference on acoustics, speech and signal …, 2016
Cited by 109, year 2016
LLaVA-NeXT-Interleave: Tackling multi-image, video, and 3D in large multimodal models
F Li, R Zhang, H Zhang, Y Zhang, B Li, W Li, Z Ma, C Li
arXiv preprint arXiv:2407.07895, 2024
Cited by 100, year 2024
Connecting speech encoder and large language model for ASR
W Yu, C Tang, G Sun, X Chen, T Tan, W Li, L Lu, Z Ma, C Zhang
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 45, year 2024
Improving Mispronunciation Detection for Non-Native Learners with Multisource Information and LSTM-Based Deep Models
W Li, NF Chen, SM Siniscalchi, CH Lee
Interspeech, 2759-2763, 2017
Cited by 44, year 2017
Video instruction tuning with synthetic data
Y Zhang, J Wu, W Li, B Li, Z Ma, Z Liu, C Li
arXiv preprint arXiv:2410.02713, 2024
Cited by 38, year 2024
Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees
W Li, K Li, SM Siniscalchi, NF Chen, CH Lee
Interspeech 2016, 3127-3131, 2016
Cited by 33, year 2016
Improving Mandarin tone recognition based on DNN by combining acoustic and articulatory features using extended recognition networks
J Lin, W Li, Y Gao, Y Xie, NF Chen, SM Siniscalchi, J Zhang, CH Lee
Journal of Signal Processing Systems 90, 1077-1087, 2018
Cited by 30, year 2018
Improving mispronunciation detection of Mandarin tones for non-native learners with soft-target tone labels and BLSTM-based deep tone models
W Li, NF Chen, SM Siniscalchi, CH Lee
IEEE/ACM Transactions on Audio, Speech, and Language Processing 27 (12 …, 2019
Cited by 29, year 2019
A cross-task transfer learning approach to adapting deep speech enhancement models to unseen background noise using paired senone classifiers
S Wang, W Li, SM Siniscalchi, CH Lee
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by 24, year 2020
A study on functional loads of phonetic contrasts under context based on mutual information of Chinese text and phonemes
J Zhang, W Li, Y Hou, W Cao, Z Xiong
2010 7th International Symposium on Chinese Spoken Language Processing, 194-198, 2010
Cited by 23, year 2010
Improving audio-visual speech recognition performance with cross-modal student-teacher training
W Li, S Wang, M Lei, SM Siniscalchi, CH Lee
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by 21, year 2019
Improving accent conversion with reference encoder and end-to-end text-to-speech
W Li, B Tang, X Yin, Y Zhao, W Li, K Wang, H Huang, Y Wang, Z Ma
arXiv preprint arXiv:2005.09271, 2020
Cited by 15, year 2020
video-SALMONN: Speech-enhanced audio-visual large language models
G Sun, W Yu, C Tang, X Chen, T Tan, W Li, L Lu, Z Ma, Y Wang, C Zhang
arXiv preprint arXiv:2406.15704, 2024
Cited by 14, year 2024
Fine-grained audio-visual joint representations for multimodal large language models
G Sun, W Yu, C Tang, X Chen, T Tan, W Li, L Lu, Z Ma, C Zhang
arXiv preprint arXiv:2310.05863, 2023
Cited by 11, year 2023
An ASR-free fluency scoring approach with self-supervised learning
W Liu, K Fu, X Tian, S Shi, W Li, Z Ma, T Lee
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 11, year 2023
Improving non-native word-level pronunciation scoring with phone-level mixup data augmentation and multi-source information
K Fu, S Gao, K Wang, W Li, X Tian, Z Ma
arXiv preprint arXiv:2203.01826, 2022
Cited by 10, year 2022
A transfer and multi-task learning based approach for MOS prediction
X Tian, K Fu, S Gao, Y Gu, K Wang, W Li, Z Ma
Proc. Interspeech 2022, 5438-5442, 2022
Cited by 10, year 2022
Improving Mandarin tone mispronunciation detection for non-native learners with soft-target tone labels and BLSTM-based deep models
W Li, NF Chen, SM Siniscalchi, CH Lee
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by 9, year 2018
Using Fluency Representation Learned from Sequential Raw Features for Improving Non-native Fluency Scoring
K Fu, S Gao, X Tian, W Li, Z Ma
INTERSPEECH, 4337-4341, 2022
Cited by 8, year 2022