Title | Authors | Venue | Cited by | Year |
MMICL: Empowering vision-language model with multi-modal in-context learning | H Zhao, Z Cai, S Si, X Ma, K An, L Chen, Z Liu, S Wang, W Han, B Chang | ICLR 2024 | 137* | 2023 |
An image is worth 1/2 tokens after layer 2: Plug-and-play inference acceleration for large vision-language models | L Chen, H Zhao, T Liu, S Bai, J Lin, C Zhou, B Chang | ECCV 2024 (Oral), 19-35 | 74 | 2025 |
PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain | L Chen, Y Zhang, S Ren, H Zhao, Z Cai, Y Wang, P Wang, X Meng, T Liu, ... | ACL 2024 | 48* | 2024 |
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale | H Zhao, X Ma, L Chen, S Si, R Wu, K An, P Yu, M Zhang, Q Li, B Chang | NeurIPS 2024 | 16 | 2024 |
ML-Bench: Large language models leverage open-source libraries for machine learning tasks | Y Liu, X Tang, Z Cai, J Lu, Y Zhang, Y Shao, Z Deng, H Hu, Z Yang, K An, ... | arXiv preprint arXiv:2311.09835 | 15* | 2023 |
MMEvalPro: Calibrating multimodal benchmarks towards trustworthy and efficient evaluation | J Huang, L Chen, T Guo, F Zeng, Y Zhao, B Wu, Y Yuan, H Zhao, Z Guo, ... | arXiv preprint arXiv:2407.00468 | 5 | 2024 |
Removing Camouflage and Revealing Collusion: Leveraging Gang-crime Pattern in Fraudster Detection | L Wang, H Zhao, C Feng, W Liu, C Huang, M Santoni, M Cristofaro, ... | KDD 2023, 5104-5115 | 5 | 2023 |
Traffic accident prediction methods based on multi-factor models | HZ Zhao, G Rao | Knowledge Science, Engineering and Management: 14th International Conference … | 4 | 2021 |
A spark of vision-language intelligence: 2-dimensional autoregressive transformer for efficient finegrained image generation | L Chen, S Tan, Z Cai, W Xie, H Zhao, Y Zhang, J Lin, J Bai, T Liu, ... | arXiv preprint arXiv:2410.01912 | 3 | 2024 |
Coarse-to-fine dual encoders are better frame identification learners | K An, C Zheng, B Gao, H Zhao, B Chang | Findings of EMNLP 2023 | 3 | 2023 |
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey | L Chen, Z Wang, S Ren, L Li, H Zhao, Y Li, Z Cai, H Guo, L Zhang, ... | arXiv preprint arXiv:2412.18619 | 2 | 2024 |
LongViTU: Instruction Tuning for Long-Form Video Understanding | R Wu, X Ma, H Ci, Y Fan, Y Wang, H Zhao, Q Li, Y Wang | arXiv preprint arXiv:2501.05037 | | 2025 |
Looking Beyond Text: Reducing Language Bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance | H Zhao, S Si, L Chen, Y Zhang, M Sun, M Zhang, B Chang | arXiv preprint arXiv:2411.14279 | | 2024 |
GATEAU: Selecting Influential Sample for Long Context Alignment | S Si, H Zhao, G Chen, Y Li, K Luo, C Lv, K An, F Qi, B Chang, M Sun | arXiv preprint arXiv:2410.15633 | | 2024 |
Rethinking Semantic Parsing for Large Language Models: Enhancing LLM Performance with Semantic Hints | K An, S Si, H Hu, H Zhao, Y Wang, Q Guo, B Chang | arXiv preprint arXiv:2409.14469 | | 2024 |
Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning | S Si, H Hu, H Zhao, S Zeng, K An, Z Cai, B Chang | Findings of the Association for Computational Linguistics: ACL 2024, 5533-5546 | | 2024 |
Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation | H Zhao, Z Cai, S Si, L Chen, Y He, K An, B Chang | NAACL 2024 | | 2024 |
Selecting Influential Samples for Long Context Alignment via Homologous Models’ Guidance and Contextual Awareness Measurement | S Si, H Zhao, G Chen, Y Li, K Luo, C Lv, K An, F Qi, B Chang, M Sun | | | 2024 |
Empowering MultiModal Models’ In-Context Learning Ability through Large Language Models | W Han, H Zhao, Z Cai | Proceedings of the ACM Turing Award Celebration Conference-China 2023, 9-10 | | 2023 |