LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention. R Zhang, J Han, D Liu, P Gao, A Zhou, X Hu, S Yan, P Lu, H Li, Y Qiao. arXiv preprint arXiv:2303.16199, 2023. Cited by 728.
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models. Z Lin, D Liu, R Zhang, P Gao, L Qiu, H Xiao, H Qiu, C Lin, W Shao, ... arXiv preprint arXiv:2311.07575, 2023. Cited by 219.
ImageBind-LLM: Multi-modality Instruction Tuning. J Han, R Zhang, W Shao, P Gao, P Xu, H Xiao, K Zhang, C Liu, S Wen, ... arXiv preprint arXiv:2309.03905, 2023. Cited by 107.
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models. D Liu, R Zhang, L Qiu, S Huang, W Lin, S Zhao, S Geng, Z Lin, P Jin, ... Forty-first International Conference on Machine Learning (ICML), 2024. Cited by 93.
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining. D Liu, S Zhao, L Zhuo, W Lin, Y Qiao, H Li, P Gao. arXiv preprint arXiv:2408.02657, 2024. Cited by 26.
Function-Consistent Feature Distillation. D Liu, M Kan, S Shan, X Chen. International Conference on Learning Representations (ICLR), 2023. Cited by 20.
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers. P Gao, L Zhuo, Z Lin, C Liu, J Chen, R Du, E Xie, X Luo, L Qiu, Y Zhang, ... arXiv preprint arXiv:2405.05945, 2024. Cited by 11.
Seamless 3D Surround View with a Novel Burger Model. L Zhang, J Chen, D Liu, Y Shen, S Zhao. 2019 IEEE International Conference on Image Processing (ICIP), 4150-4154, 2019. Cited by 11.
VEnhancer: Generative Space-Time Enhancement for Video Generation. J He, T Xue, D Liu, X Lin, P Gao, D Lin, Y Qiao, W Ouyang, Z Liu. arXiv preprint arXiv:2407.07667, 2024. Cited by 9.
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT. L Zhuo, R Du, H Xiao, Y Li, D Liu, R Huang, W Liu, L Zhao, FY Wang, ... arXiv preprint arXiv:2406.18583, 2024. Cited by 4.
Triplet Knowledge Distillation. X Wang, D Liu, M Kan, C Han, Z Wu, S Shan. arXiv preprint arXiv:2305.15975, 2023. Cited by 4.
A Simple Romance Between Multi-Exit Vision Transformer and Token Reduction. D Liu, M Kan, S Shan, X Chen. The Twelfth International Conference on Learning Representations (ICLR), 2024. Cited by 3*.
I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow. R Du, D Liu, L Zhuo, Q Qi, H Li, Z Ma, P Gao. arXiv preprint arXiv:2410.07536, 2024. Cited by 1.