EfficientViT: Lightweight multi-scale attention for high-resolution dense prediction | H Cai, J Li, M Hu, C Gan, S Han | Proceedings of the IEEE/CVF international conference on computer vision …, 2023 | 168* | 2023 |
MultiPLY: A multisensory object-centric embodied large language model in 3D world | Y Hong, Z Zheng, P Chen, Y Wang, J Li, C Gan | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 21 | 2024 |
CoVLM: Composing visual entities and relationships in large language models via communicative decoding | J Li, D Chen, Y Hong, Z Chen, P Chen, Y Shen, C Gan | arXiv preprint arXiv:2311.03354, 2023 | 12 | 2023 |
Constraint-aware and ranking-distilled token pruning for efficient transformer inference | J Li, LL Zhang, J Xu, Y Wang, S Yan, Y Xia, Y Yang, T Cao, H Sun, ... | Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and …, 2023 | 11 | 2023 |
FlexAttention for efficient high-resolution vision-language models | J Li, D Chen, T Cai, P Chen, Y Hong, Z Chen, Y Shen, C Gan | European Conference on Computer Vision, 286-302, 2024 | 6 | 2024 |
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences | H Zhi, P Chen, J Li, S Ma, X Sun, T Xiang, Y Lei, M Tan, C Gan | arXiv preprint arXiv:2412.01292, 2024 | | 2024 |