Le Zhuo

Cited by

	All	Since 2020
Citations	288	287
h-index	10	9
i10-index	10	9

Citations per year: 2023: 15, 2024: 250, 2025: 22

Public access

4 articles available, 0 articles not available

Based on funding mandates

Co-authors

Peng Gao - Shanghai AI Lab - Verified email at pjlab.org.cn
si liu - Beihang University - Verified email at buaa.edu.cn
Yue Liao - The Chinese University of Hong Kong - Verified email at cuhk.edu.hk
Hongsheng Li (李鸿升) - The Chinese University of Hong Kong - Verified email at ee.cuhk.edu.hk
Jie Fu - Shanghai AI Lab - Verified email at lisa.iro.umontreal.ca
Shuicheng Yan, Fellow of AAAI, ACM,... - National University of Singapore, Ex: Sea AI Lab, Skywork AI | Looking for labmates - Verified email at nus.edu.sg
Jian Tang (唐建) - Associate Professor, Mila-Quebec AI Institute, HEC Montréal, Canada CIFAR AI Chair - Verified email at hec.ca

Le Zhuo

Shanghai AI Lab

Verified email at pjlab.org.cn

generative models, multi-modal learning


Title	Authors	Venue	Cited by	Year
GraphText: Graph reasoning in text space	J Zhao, L Zhuo, Y Shen, M Qu, K Liu, M Bronstein, Z Zhu, J Tang	arXiv preprint arXiv:2310.01089, 2023	57	2023
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers	P Gao, L Zhuo, Z Lin, C Liu, J Chen, R Du, E Xie, X Luo, L Qiu, Y Zhang, ...	arXiv preprint arXiv:2405.05945, 2024	54*	2024
Video background music generation: Dataset, method and evaluation	L Zhuo, Z Wang, B Wang, Y Liao, C Bao, S Peng, S Han, A Zhang, F Fang, ...	Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	28	2023
Lumina-mGPT: Illuminate flexible photorealistic text-to-image generation with multimodal generative pretraining	D Liu, S Zhao, L Zhuo, W Lin, Y Qiao, H Li, P Gao	arXiv preprint arXiv:2408.02657, 2024	26	2024
MARBLE: Music audio representation benchmark for universal evaluation	R Yuan, Y Ma, Y Li, G Zhang, X Chen, H Yin, Y Liu, J Huang, Z Tian, ...	Advances in Neural Information Processing Systems 36, 39626-39647, 2023	23	2023
LyricWhiz: Robust multilingual lyrics transcription by whispering to ChatGPT	L Zhuo, R Yuan, J Pan, Y Ma, Y Li, G Zhang, S Liu, R Dannenberg, J Fu, ...	International Society for Music Information Retrieval Conference (ISMIR), 2023	22*	2023
DiffDance: Cascaded human motion diffusion model for dance generation	Q Qi, L Zhuo, A Zhang, Y Liao, F Fang, S Liu, S Yan	Proceedings of the 31st ACM International Conference on Multimedia, 1374-1382, 2023	19	2023
Lumina-Next: Making Lumina-T2X stronger and faster with Next-DiT	L Zhuo, R Du, H Xiao, Y Li, D Liu, R Huang, W Liu, L Zhao, FY Wang, ...	arXiv preprint arXiv:2406.18583, 2024	17*	2024
ProtLLM: An interleaved protein-language LLM with protein-as-word pre-training	L Zhuo, Z Chi, M Xu, H Huang, H Zheng, C He, XL Mao, W Zhang	arXiv preprint arXiv:2403.07920, 2024	14	2024
LLMs as visual explainers: Advancing image classification with evolving visual descriptions	S Han, L Zhuo, Y Liao, S Liu	arXiv preprint arXiv:2311.11904, 2023	10	2023
LLaVA-MoD: Making LLaVA tiny via MoE knowledge distillation	F Shu, Y Liao, L Zhuo, C Xu, L Zhang, G Zhang, H Shi, L Chen, T Zhong, ...	arXiv preprint arXiv:2408.15881, 2024	9	2024
Customize your visual autoregressive recipe with set autoregressive modeling	W Liu, L Zhuo, Y Xin, S Xia, P Gao, X Yue	arXiv preprint arXiv:2410.10511, 2024	5	2024
PixWizard: Versatile image-to-image visual assistant with open-language instructions	W Lin, X Wei, R Zhang, L Zhuo, S Zhao, S Huang, J Xie, Y Qiao, P Gao, ...	arXiv preprint arXiv:2409.15278, 2024	3	2024
I-Max: Maximize the resolution potential of pre-trained rectified flow transformers with projected flow	R Du, D Liu, L Zhuo, Q Qi, H Li, Z Ma, P Gao	arXiv preprint arXiv:2410.07536, 2024	1	2024
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models	J Lei, R Zhang, X Hu, W Lin, Z Li, W Sun, R Du, L Zhuo, Z Li, X Li, S Zhao, ...	arXiv preprint arXiv:2501.13920, 2025		2025
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation	B Wang, L Zhuo, Z Wang, C Bao, W Chengjing, X Nie, J Dai, J Han, ...	arXiv preprint arXiv:2412.09428, 2024		2024
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection	S Han, W Huang, H Shi, L Zhuo, X Su, S Zhang, X Zhou, X Qi, Y Liao, ...	arXiv preprint arXiv:2411.14794, 2024		2024

Articles 1–17
