Kaichen Zhang
Title
Cited by
Year
Llava-onevision: Easy visual task transfer
B Li, Y Zhang, D Guo, R Zhang, F Li, H Zhang, K Zhang, P Zhang, Y Li, ...
arXiv preprint arXiv:2408.03326, 2024
245
2024
Long Context Transfer from Language to Vision
P Zhang, K Zhang, B Li, G Zeng, J Yang, Y Zhang, Z Wang, H Tan, C Li, ...
arXiv preprint arXiv:2406.16852, 2024
74*
2024
Llava-next: Stronger llms supercharge multimodal capabilities in the wild
B Li, K Zhang, H Zhang, D Guo, R Zhang, F Li, Y Zhang, Z Liu, C Li
May, 2024
49
2024
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
K Zhang, B Li, P Zhang, F Pu, JA Cahyono, K Hu, S Liu, Y Zhang, J Yang, ...
arXiv preprint arXiv:2407.12772, 2024
27
2024
Lmms-eval: Accelerating the development of large multimodal models
B Li, P Zhang, K Zhang, F Pu, X Du, Y Dong, H Liu, Y Zhang, G Zhang, ...
March, 2024
21*
2024
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning
Y Zhang, K Zhang, B Li, F Pu, CA Setiadharma, J Yang, Z Liu
arXiv preprint arXiv:2405.03272, 2024
4
2024
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
K Zhang, Y Shen, B Li, Z Liu
arXiv preprint arXiv:2411.14982, 2024
2024
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures
J Ni, Y Song, D Ghosal, B Li, DJ Zhang, X Yue, F Xue, Z Zheng, K Zhang, ...
arXiv preprint arXiv:2410.13754, 2024
2024