Le Zhuo
Shanghai AI Lab
Verified email at pjlab.org.cn - Homepage
Title
Cited by
Year
Graphtext: Graph reasoning in text space
J Zhao, L Zhuo, Y Shen, M Qu, K Liu, M Bronstein, Z Zhu, J Tang
arXiv preprint arXiv:2310.01089, 2023
Cited by: 57 | Year: 2023
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
P Gao, L Zhuo, Z Lin, C Liu, J Chen, R Du, E Xie, X Luo, L Qiu, Y Zhang, ...
arXiv preprint arXiv:2405.05945, 2024
Cited by: 54* | Year: 2024
Video background music generation: Dataset, method and evaluation
L Zhuo, Z Wang, B Wang, Y Liao, C Bao, S Peng, S Han, A Zhang, F Fang, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 28 | Year: 2023
Lumina-mgpt: Illuminate flexible photorealistic text-to-image generation with multimodal generative pretraining
D Liu, S Zhao, L Zhuo, W Lin, Y Qiao, H Li, P Gao
arXiv preprint arXiv:2408.02657, 2024
Cited by: 26 | Year: 2024
Marble: Music audio representation benchmark for universal evaluation
R Yuan, Y Ma, Y Li, G Zhang, X Chen, H Yin, Y Liu, J Huang, Z Tian, ...
Advances in Neural Information Processing Systems 36, 39626-39647, 2023
Cited by: 23 | Year: 2023
Lyricwhiz: Robust multilingual lyrics transcription by whispering to chatgpt
L Zhuo, R Yuan, J Pan, Y Ma, Y Li, G Zhang, S Liu, R Dannenberg, J Fu, ...
International Society for Music Information Retrieval Conference (ISMIR), 2023
Cited by: 22* | Year: 2023
Diffdance: Cascaded human motion diffusion model for dance generation
Q Qi, L Zhuo, A Zhang, Y Liao, F Fang, S Liu, S Yan
Proceedings of the 31st ACM International Conference on Multimedia, 1374-1382, 2023
Cited by: 19 | Year: 2023
Lumina-next: Making lumina-t2x stronger and faster with next-dit
L Zhuo, R Du, H Xiao, Y Li, D Liu, R Huang, W Liu, L Zhao, FY Wang, ...
arXiv preprint arXiv:2406.18583, 2024
Cited by: 17* | Year: 2024
Protllm: An interleaved protein-language llm with protein-as-word pre-training
L Zhuo, Z Chi, M Xu, H Huang, H Zheng, C He, XL Mao, W Zhang
arXiv preprint arXiv:2403.07920, 2024
Cited by: 14 | Year: 2024
Llms as visual explainers: Advancing image classification with evolving visual descriptions
S Han, L Zhuo, Y Liao, S Liu
arXiv preprint arXiv:2311.11904, 2023
Cited by: 10 | Year: 2023
Llava-mod: Making llava tiny via moe knowledge distillation
F Shu, Y Liao, L Zhuo, C Xu, L Zhang, G Zhang, H Shi, L Chen, T Zhong, ...
arXiv preprint arXiv:2408.15881, 2024
Cited by: 9 | Year: 2024
Customize your visual autoregressive recipe with set autoregressive modeling
W Liu, L Zhuo, Y Xin, S Xia, P Gao, X Yue
arXiv preprint arXiv:2410.10511, 2024
Cited by: 5 | Year: 2024
PixWizard: Versatile image-to-image visual assistant with open-language instructions
W Lin, X Wei, R Zhang, L Zhuo, S Zhao, S Huang, J Xie, Y Qiao, P Gao, ...
arXiv preprint arXiv:2409.15278, 2024
Cited by: 3 | Year: 2024
I-max: Maximize the resolution potential of pre-trained rectified flow transformers with projected flow
R Du, D Liu, L Zhuo, Q Qi, H Li, Z Ma, P Gao
arXiv preprint arXiv:2410.07536, 2024
Cited by: 1 | Year: 2024
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
J Lei, R Zhang, X Hu, W Lin, Z Li, W Sun, R Du, L Zhuo, Z Li, X Li, S Zhao, ...
arXiv preprint arXiv:2501.13920, 2025
Year: 2025
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation
B Wang, L Zhuo, Z Wang, C Bao, W Chengjing, X Nie, J Dai, J Han, ...
arXiv preprint arXiv:2412.09428, 2024
Year: 2024
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
S Han, W Huang, H Shi, L Zhuo, X Su, S Zhang, X Zhou, X Qi, Y Liao, ...
arXiv preprint arXiv:2411.14794, 2024
Year: 2024