Popmag: Pop music accompaniment generation Y Ren, J He, X Tan, T Qin, Z Zhao, TY Liu Proceedings of the 28th ACM international conference on multimedia, 1198-1206, 2020 | 137 | 2020 |
Geneface: Generalized and high-fidelity audio-driven 3d talking face synthesis Z Ye, Z Jiang, Y Ren, J Liu, J He, Z Zhao arXiv preprint arXiv:2301.13430, 2023 | 118 | 2023 |
M4singer: A multi-style, multi-singer and musical score provided mandarin singing corpus L Zhang, R Li, S Wang, L Deng, J Liu, Y Ren, J He, R Huang, J Zhu, ... Advances in Neural Information Processing Systems 35, 6914-6926, 2022 | 78 | 2022 |
Qwen2-audio technical report Y Chu, J Xu, Q Yang, H Wei, X Wei, Z Guo, Y Leng, Y Lv, J He, J Lin, ... arXiv preprint arXiv:2407.10759, 2024 | 76 | 2024 |
Qwen2 technical report A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ... URL https://arxiv.org/abs/2407.10671, 2024 | 50 | 2024 |
Transpeech: Speech-to-speech translation with bilateral perturbation R Huang, J Liu, H Liu, Y Ren, L Zhang, J He, Z Zhao arXiv preprint arXiv:2205.12523, 2022 | 47 | 2022 |
Geneface++: Generalized and stable real-time audio-driven 3d talking face generation Z Ye, J He, Z Jiang, R Huang, J Huang, J Liu, Y Ren, X Yin, Z Ma, Z Zhao arXiv preprint arXiv:2305.00787, 2023 | 34 | 2023 |
Mega-tts 2: Zero-shot text-to-speech with arbitrary length speech prompts Z Jiang, J Liu, Y Ren, J He, C Zhang, Z Ye, P Wei, C Wang, X Yin, Z Ma, ... arXiv preprint arXiv:2307.07218, 2023 | 30 | 2023 |
Real3d-portrait: One-shot realistic 3d talking portrait synthesis Z Ye, T Zhong, Y Ren, J Yang, W Li, J Huang, Z Jiang, J He, R Huang, ... arXiv preprint arXiv:2401.08503, 2024 | 27 | 2024 |
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis Z Jiang, J Liu, Y Ren, J He, Z Ye, S Ji, Q Yang, C Zhang, P Wei, C Wang, ... The Twelfth International Conference on Learning Representations, 2024 | 25 | 2024 |
Clapspeech: Learning prosody from text context with contrastive language-audio pre-training Z Ye, R Huang, Y Ren, Z Jiang, J Liu, J He, X Yin, Z Zhao arXiv preprint arXiv:2305.10763, 2023 | 19 | 2023 |
RMSSinger: realistic-music-score based singing voice synthesis J He, J Liu, Z Ye, R Huang, C Cui, H Liu, Z Zhao arXiv preprint arXiv:2305.10686, 2023 | 19 | 2023 |
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis Y Zhang, R Huang, R Li, JZ He, Y Xia, F Chen, X Duan, B Huai, Z Zhao Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 19597 …, 2024 | 16 | 2024 |
Av-transpeech: Audio-visual robust speech-to-speech translation R Huang, H Liu, X Cheng, Y Ren, L Li, Z Ye, J He, L Zhang, J Liu, X Yin, ... arXiv preprint arXiv:2305.15403, 2023 | 14 | 2023 |
Vit-tts: visual text-to-speech with scalable diffusion transformer H Liu, R Huang, X Lin, W Xu, M Zheng, H Chen, J He, Z Zhao arXiv preprint arXiv:2305.12708, 2023 | 14 | 2023 |
Flow-based unconstrained lip to speech generation J He, Z Zhao, Y Ren, J Liu, B Huai, N Yuan Proceedings of the AAAI Conference on Artificial Intelligence 36 (1), 843-851, 2022 | 13 | 2022 |
Qwen2 technical report A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ... arXiv preprint arXiv:2407.10671, 2024 | 8 | 2024 |
Unisinger: Unified end-to-end singing voice synthesis with cross-modality information matching Z Hong, C Cui, R Huang, L Zhang, J Liu, J He, Z Zhao Proceedings of the 31st ACM International Conference on Multimedia, 7569-7579, 2023 | 7 | 2023 |
Boosting prompting mechanisms for zero-shot speech synthesis Z Jiang, J Liu, Y Ren, J He, Z Ye, S Ji, Q Yang, C Zhang, P Wei, C Wang, ... The Twelfth International Conference on Learning Representations, 2023 | 7 | 2023 |
DualSign: Semi-Supervised Sign Language Production with Balanced Multi-Modal Multi-Task Dual Transformation W Huang, Z Zhao, J He, M Zhang Proceedings of the 30th ACM International Conference on Multimedia, 5486-5495, 2022 | 6 | 2022 |