VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models. H Duan, J Yang, Y Qiao, X Fang, L Chen, Y Liu, X Dong, Y Zang, P Zhang, et al. arXiv preprint arXiv:2407.11691, 2024.
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding. X Fang, K Mao, H Duan, X Zhao, Y Li, D Lin, K Chen. arXiv preprint arXiv:2406.14515, 2024.
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs. Y Qiao, H Duan, X Fang, J Yang, L Chen, S Zhang, J Wang, D Lin, et al. arXiv preprint arXiv:2406.14544, 2024.
Multimodal Fusion of EHR in Structures and Semantics: Integrating Clinical Records and Notes with Hypergraph and LLM. H Cui, X Fang, R Xu, X Kan, JC Ho, C Yang. arXiv preprint arXiv:2403.08818, 2024.
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs. J Zhuo, S Zhang, X Fang, H Duan, D Lin, K Chen. arXiv preprint arXiv:2410.12405, 2024.
MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs. C Fu, YF Zhang, S Yin, B Li, X Fang, S Zhao, H Duan, X Sun, Z Liu, et al. arXiv preprint arXiv:2411.15296, 2024.
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting. H Cui, X Fang, Z Zhang, R Xu, X Kan, X Liu, Y Yu, M Li, Y Song, C Yang. Advances in Neural Information Processing Systems 36, 2024.
Redundancy Principles for MLLMs Benchmarks. Z Zhang, X Zhao, X Fang, C Li, X Liu, X Min, H Duan, K Chen, G Zhai. arXiv preprint arXiv:2501.13953, 2025.