MME: A comprehensive evaluation benchmark for multimodal large language models C Fu, P Chen, Y Shen, Y Qin, M Zhang, X Lin, J Yang, X Zheng, K Li, ... arXiv preprint arXiv:2306.13394, 2023 | 1067* | 2023 |
CF-ViT: A General Coarse-to-Fine Method for Vision Transformer M Chen, M Lin, K Li, Y Shen, Y Wu, F Chao, R Ji Proceedings of the AAAI Conference on Artificial Intelligence (AAAI, Oral), 2022 | 300* | 2022 |
Asymmetric Co-Teaching for Unsupervised Cross Domain Person Re-Identification F Yang, K Li, Z Zhong, Z Luo, X Sun, H Cheng, X Guo, F Huang, R Ji, S Li Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020 | 189 | 2020 |
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer Y Xu, Z Zhang, M Zhang, K Sheng, K Li, W Dong, L Zhang, C Xu, X Sun Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021 | 188 | 2021 |
Woodpecker: Hallucination correction for multimodal large language models S Yin, C Fu, S Zhao, T Xu, H Wang, D Sui, Y Shen, K Li, X Sun, E Chen arXiv preprint arXiv:2310.16045, 2023 | 165 | 2023 |
Toward an Expert Level of Lung Cancer Detection and Classification using a Deep Convolutional Neural Network C Zhang, X Sun, K Dang, K Li, X Guo, J Chang, Z Yu, F Huang, Y Wu, ... The Oncologist 24 (9), 1159-1165, 2019 | 152 | 2019 |
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis C Fu, Y Dai, Y Luo, L Li, S Ren, R Zhang, Z Wang, C Zhou, Y Shen, ... arXiv preprint arXiv:2405.21075, 2024 | 139 | 2024 |
Pruning Filter in Filter F Meng, H Cheng, K Li, H Luo, X Guo, G Lu, X Sun Advances in Neural Information Processing Systems (NeurIPS), 2020 | 138 | 2020 |
ISTR: End-to-End Instance Segmentation with Transformers J Hu, L Cao, Y Lu, SC Zhang, Y Wang, K Li, F Huang, L Shao, R Ji arXiv preprint arXiv:2105.00637, 2021 | 112 | 2021 |
Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning J Wang, Y Gao, K Li, Y Lin, AJ Ma, X Sun Computer Vision and Pattern Recognition (CVPR), 2020 | 109 | 2020 |
PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining Y Gao, J Liu, Z Xu, J Zhang, K Li, R Ji, C Shen Advances in Neural Information Processing Systems (NeurIPS, Oral), 2022 | 100 | 2022 |
Training-free Transformer Architecture Search Q Zhou, K Sheng, X Zheng, K Li, X Sun, Y Tian, J Chen, R Ji Computer Vision and Pattern Recognition (CVPR, Oral), 2022 | 66 | 2022 |
DisCo: Remedying Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning Y Gao, JX Zhuang, S Lin, H Cheng, X Sun, K Li, C Shen European Conference on Computer Vision (ECCV, Oral), 237-253, 2022 | 65 | 2022 |
Learning Best Combination for Efficient N:M Sparsity Y Zhang, M Lin, Z Lin, Y Luo, K Li, F Chao, Y Wu, R Ji Advances in Neural Information Processing Systems (NeurIPS), 2022 | 62 | 2022 |
Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion J Wang, Y Gao, K Li, X Jiang, X Guo, R Ji, X Sun Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020 | 62 | 2020 |
Semi-Supervised Adversarial Monocular Depth Estimation R Ji, K Li, Y Wang, X Sun, F Guo, X Guo, Y Wu, F Huang, J Luo IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019 | 60 | 2019 |
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise C Fu, R Zhang, H Lin, Z Wang, T Gao, Y Luo, Y Huang, Z Zhang, L Qiu, ... arXiv preprint arXiv:2312.12436, 2023 | 51 | 2023 |
SoftCLIP: Softer Cross-modal Alignment Makes CLIP Stronger Y Gao, J Liu, Z Xu, T Wu, W Liu, J Yang, K Li, X Sun Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023 | 43 | 2023 |
Long-Tailed Class Incremental Learning X Liu, YS Hu, XS Cao, AD Bagdanov, K Li, MM Cheng European Conference on Computer Vision (ECCV), 495-512, 2022 | 43 | 2022 |
Filter Grafting for Deep Neural Networks F Meng, H Cheng, K Li, Z Xu, R Ji, X Sun, G Lu Computer Vision and Pattern Recognition (CVPR), 2020 | 43 | 2020 |