AdaViT: Adaptive vision transformers for efficient image recognition L Meng, H Li, BC Chen, S Lan, Z Wu, YG Jiang, SN Lim CVPR 2022, 2022 | 260 | 2022 |
To see is to believe: Prompting GPT-4V for better visual instruction tuning J Wang, L Meng, Z Weng, B He, Z Wu, YG Jiang arXiv preprint arXiv:2311.07574, 2023 | 82 | 2023 |
Detection hub: Unifying object detection datasets via query adaptation on language embedding L Meng, X Dai, Y Chen, P Zhang, D Chen, M Liu, J Wang, Z Wu, L Yuan, ... CVPR 2023, 2023 | 26 | 2023 |
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs L Meng, J Yang, R Tian, X Dai, Z Wu, J Gao, YG Jiang NeurIPS 2024, 2024 | 10 | 2024 |
Learning from rich semantics and coarse locations for long-tailed object detection L Meng, X Dai, J Yang, D Chen, Y Chen, M Liu, YL Chen, Z Wu, L Yuan, ... NeurIPS 2023, 2023 | 9 | 2023 |
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation L Meng, S Lan, H Li, JM Alvarez, Z Wu, YG Jiang ECCV 2024, 2024 | 8 | 2024 |
A multimodal framework for video ads understanding Z Weng, L Meng, R Wang, Z Wu, YG Jiang Proceedings of the 29th ACM International Conference on Multimedia, 4843-4847, 2021 | 3 | 2021 |
FOCUS: Towards Universal Foreground Segmentation Z You, L Kong, L Meng, Z Wu AAAI 2025, 2025 | 1 | 2025 |
V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results J Wang, Y Zang, P Zhang, T Chu, Y Cao, Z Sun, Z Liu, X Dong, T Wu, ... arXiv preprint arXiv:2406.11739, 2024 | 1 | 2024 |
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection Y Chen, W Yao, L Meng, S Wu, Z Wu, YG Jiang AAAI 2025, 2024 | | 2024 |
Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning W Peng, L Meng, Y Chen, Y Xie, Y Liu, T Gui, H Xu, X Qiu, Z Wu, YG Jiang arXiv preprint arXiv:2412.03565, 2024 | | 2024 |