Qwen2.5 technical report A Yang, B Yang, B Zhang, B Hui, B Zheng, B Yu, C Li, D Liu, F Huang, ... arXiv preprint arXiv:2412.15115, 2024 | 959 | 2024 |
PromptTTS: Controllable text-to-speech with text descriptions Z Guo, Y Leng, Y Wu, S Zhao, X Tan ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 102 | 2023 |
Qwen2-audio technical report Y Chu, J Xu, Q Yang, H Wei, X Wei, Z Guo, Y Leng, Y Lv, J He, J Lin, ... arXiv preprint arXiv:2407.10759, 2024 | 91 | 2024 |
Qwen2 technical report, 2024 A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ... URL https://arxiv.org/abs/2407.10671, 0 | 66 | |
PromptTTS 2: Describing and generating voices with text prompt Y Leng, Z Guo, K Shen, X Tan, Z Ju, Y Liu, Y Liu, D Yang, L Zhang, ... arXiv preprint arXiv:2309.02285, 2023 | 41 | 2023 |
Audio generation with multiple conditional diffusion model Z Guo, J Mao, R Tao, L Yan, K Ouchi, H Liu, X Wang Proceedings of the AAAI Conference on Artificial Intelligence 38 (16), 18153 …, 2024 | 13 | 2024 |
Qwen2 technical report. CoRR, abs/2407.10671, 2024. doi: 10.48550 A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ... arXiv preprint arXiv:2407.10671, 2024 | 9 | 2024 |
Advancing multi-grained alignment for contrastive language-audio pre-training Y Li, Z Guo, X Wang, H Liu Proceedings of the 32nd ACM International Conference on Multimedia, 7356-7365, 2024 | 3 | 2024 |
Qwen2-audio technical report, 2024 Y Chu, J Xu, Q Yang, H Wei, X Wei, Z Guo, Y Leng, Y Lv, J He, J Lin, ... URL https://arxiv.org/abs/2407.10759, 0 | 3 | |
Qwen2 technical report. arXiv preprint A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ... arXiv preprint arXiv:2407.10671, 2024 | 2 | 2024 |
A hybrid system of sound event detection transformer and frame-wise model for DCASE 2022 Task 4 Y Li, Z Guo, Z Ye, X Wang, H Liu, Y Qian, R Tao, L Yan, K Ouchi arXiv preprint arXiv:2210.09529, 2022 | 2 | 2022 |
Qwen2 Technical Report, July 2024a A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ... URL https://arxiv.org/abs/2407.10671 v4, 6, 0 | 2 | |
Analyzing and Mitigating Inconsistency in Discrete Audio Tokens for Neural Codec Language Models W Liu, Z Guo, J Xu, Y Lv, Y Chu, Z Zhao, J Lin arXiv preprint arXiv:2409.19283, 2024 | 1 | 2024 |
Leveraging Language Model Capabilities for Sound Event Detection H Wang, J Mao, Z Guo, J Wan, H Liu, X Wang arXiv preprint arXiv:2308.11530, 2023 | 1 | 2023 |
Furnishing Sound Event Detection with Language Model Abilities H Wang, J Mao, Z Guo, J Wan, H Liu, X Wang CoRR, 2023 | 1 | 2023 |