MM-LLMs: Recent advances in multimodal large language models D Zhang, Y Yu, J Dong, C Li, D Su, C Chu, D Yu arXiv preprint arXiv:2401.13601, 2024 | 205 | 2024 |
CBLDNN-based speaker-independent speech separation via generative adversarial training C Li, L Zhu, S Xu, P Gao, B Xu 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 50 | 2018 |
Single-channel Speech Dereverberation via Generative Adversarial Training C Li, T Wang, S Xu, B Xu Proc. Interspeech 2018, 2018 | 22 | 2018 |
Compression of acoustic model via knowledge distillation and pruning C Li, L Zhu, S Xu, P Gao, B Xu 2018 24th International Conference on Pattern Recognition (ICPR), 2785-2790, 2018 | 14 | 2018 |
Multi-task audio source separation L Zhang, C Li, F Deng, X Wang 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021 | 11 | 2021 |
One-shot voice conversion based on speaker aware module Y Zhang, H Che, J Li, C Li, X Wang, Z Wang ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 10 | 2021 |
EAD-Conformer: a conformer-based encoder-attention-decoder-network for multi-task audio source separation C Li, Y Wang, F Deng, Z Zhang, X Wang, Z Wang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 8 | 2022 |
Speaker and direction inferred dual-channel speech separation C Li, J Xu, N Mesgarani, B Xu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 8 | 2021 |
Video-to-audio generation with hidden alignment M Xu, C Li, X Tu, Y Ren, R Chen, Y Gu, W Liang, D Yu arXiv preprint arXiv:2407.07464, 2024 | 5 | 2024 |
Recurrent neural network based small-footprint wake-up-word speech recognition system with a score calibration method C Li, L Zhu, S Xu, P Gao, B Xu 2018 24th International Conference on Pattern Recognition (ICPR), 3222-3227, 2018 | 4 | 2018 |
EzAudio: Enhancing text-to-audio generation with efficient diffusion transformer J Hai, Y Xu, H Zhang, C Li, H Wang, M Elhilali, D Yu arXiv preprint arXiv:2409.10819, 2024 | 3 | 2024 |
The ZTSpeech system for CHiME-5 challenge: A far-field speech recognition system with front-end and robust back-end C Li, T Wang Proc. of The 5th Intl. Workshop on Speech Processing in Everyday …, 2018 | 3 | 2018 |
STA-V2A: Video-to-audio generation with semantic and temporal alignment Y Ren, C Li, M Xu, W Liang, Y Gu, R Chen, D Yu arXiv preprint arXiv:2409.08601, 2024 | 2 | 2024 |
Conformer Space Neural Architecture Search for Multi-Task Audio Separation S Lu, Y Wang, P Yao, C Li, J Tan, F Deng, X Wang, C Song Proc. Interspeech 2022, 5358-5362, 2022 | 2 | 2022 |
Video-to-Audio Generation with Fine-grained Temporal Semantics Y Hu, Y Gu, C Li, R Chen, D Yu arXiv preprint arXiv:2409.14709, 2024 | 1 | 2024 |
Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0 Z Wang, R Fu, Z Wen, J Tao, X Wang, Y Xie, X Qi, S Shi, Y Lu, Y Liu, C Li, ... arXiv preprint arXiv:2409.11909, 2024 | 1 | 2024 |
Towards Diverse and Efficient Audio Captioning via Diffusion Models M Xu, C Li, X Tu, Y Ren, R Fu, W Liang, D Yu arXiv preprint arXiv:2409.09401, 2024 | 1 | 2024 |
Prompt-guided Precise Audio Editing with Diffusion Models M Xu, C Li, D Su, W Liang, D Yu arXiv preprint arXiv:2406.04350, 2024 | 1 | 2024 |
WA-Transformer: Window Attention-based Transformer with Two-stage Strategy for Multi-task Audio Source Separation Y Wang, C Li, F Deng, S Lu, P Yao, J Tan, C Song, X Wang Proc. Interspeech 2022, 5373-5377, 2022 | 1 | 2022 |
Rethinking Singing Voice Separation With Spectral-Temporal Transformer S Yu, C Li, F Deng, X Wang 2021 Asia-Pacific Signal and Information Processing Association Annual …, 2021 | 1 | 2021 |