Multi-task collaborative network for joint referring expression comprehension and segmentation G Luo, Y Zhou, X Sun, L Cao, C Wu, C Deng, R Ji Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020 | 327 | 2020 |
Improving image captioning by leveraging intra- and inter-layer global representation in transformer network J Ji, Y Luo, X Sun, F Chen, G Luo, Y Wu, Y Gao, R Ji Proceedings of the AAAI Conference on Artificial Intelligence 35 (2), 1655-1663, 2021 | 191 | 2021 |
SeqTR: A simple yet universal network for visual grounding C Zhu, Y Zhou, Y Shen, G Luo, X Pan, M Lin, C Chen, L Cao, X Sun, R Ji European Conference on Computer Vision, 598-615, 2022 | 152 | 2022 |
Cascade grouped attention network for referring expression segmentation G Luo, Y Zhou, R Ji, X Sun, J Su, CW Lin, Q Tian Proceedings of the 28th ACM International Conference on Multimedia, 1274-1282, 2020 | 133 | 2020 |
Cheap and quick: Efficient vision-language instruction tuning for large language models G Luo, Y Zhou, T Ren, S Chen, X Sun, R Ji NeurIPS 2023, 2023 | 107 | 2023 |
Towards efficient visual adaption via structural re-parameterization G Luo, M Huang, Y Zhou, X Sun, G Jiang, Z Wang, R Ji arXiv preprint arXiv:2302.08106, 2023 | 77 | 2023 |
Active teacher for semi-supervised object detection P Mi, J Lin, Y Zhou, Y Shen, G Luo, X Sun, L Cao, R Fu, Q Xu, R Ji Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 73 | 2022 |
A real-time global inference network for one-stage referring expression comprehension Y Zhou, R Ji, G Luo, X Sun, J Su, X Ding, CW Lin, Q Tian IEEE Transactions on Neural Networks and Learning Systems 34 (1), 134-143, 2021 | 71 | 2021 |
Towards lightweight transformer via group-wise transformation for vision-and-language tasks G Luo, Y Zhou, X Sun, Y Wang, L Cao, Y Wu, F Huang, R Ji IEEE Transactions on Image Processing 31, 3386-3398, 2022 | 49 | 2022 |
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models G Luo, Y Zhou, Y Zhang, X Zheng, X Sun, R Ji arXiv preprint arXiv:2403.03003, 2024 | 45 | 2024 |
K-armed bandit based multi-modal network architecture search for visual question answering Y Zhou, R Ji, X Sun, G Luo, X Hong, J Su, X Ding, L Shao Proceedings of the 28th ACM International Conference on Multimedia, 1245-1254, 2020 | 26 | 2020 |
Multi-branch distance-sensitive self-attention network for image captioning J Ji, X Huang, X Sun, Y Zhou, G Luo, L Cao, J Liu, L Shao, R Ji IEEE Transactions on Multimedia 25, 3962-3974, 2022 | 20 | 2022 |
RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension L Jin*, G Luo*, Y Zhou, X Sun, G Jiang, A Shu, R Ji Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 18 | 2023 |
3D-STMN: Dependency-driven superpoint-text matching network for end-to-end 3D referring expression segmentation C Wu, Y Ma, Q Chen, H Wang, G Luo, J Ji, X Sun Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5940-5948, 2024 | 17 | 2024 |
A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension G Luo, Y Zhou, J Sun, X Sun, R Ji IEEE Transactions on Multimedia, 2023 | 17* | 2023 |
Towards language-guided visual recognition via dynamic convolutions G Luo, Y Zhou, X Sun, Y Wu, Y Gao, R Ji International Journal of Computer Vision 132 (1), 1-19, 2024 | 16 | 2024 |
RefTeacher: A strong baseline for semi-supervised referring expression comprehension J Sun, G Luo, Y Zhou, X Sun, G Jiang, Z Wang, R Ji Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 15 | 2023 |
Mono-InternVL: Pushing the boundaries of monolithic multimodal large language models with endogenous visual pre-training G Luo, X Yang, W Dou, Z Wang, J Dai, Y Qiao, X Zhu arXiv preprint arXiv:2410.08202, 2024 | 11 | 2024 |
ControlMLLM: Training-free visual prompt learning for multimodal large language models M Wu, X Cai, J Ji, J Li, O Huang, G Luo, H Fei, G Jiang, X Sun, R Ji arXiv preprint arXiv:2407.21534, 2024 | 8 | 2024 |
Towards end-to-end semi-supervised learning for one-stage object detection G Luo, Y Zhou, L Jin, X Sun, R Ji arXiv preprint arXiv:2302.11299, 2023 | 5 | 2023 |