Llava-onevision: Easy visual task transfer

B Li, Y Zhang, D Guo, R Zhang, F Li, H Zhang… - arXiv …

Math-llava: Bootstrapping mathematical reasoning for multimodal large language models

W Shi, Z Hu, Y Bin, J Liu, Y Yang, SK Ng, L Bing… - arXiv …

… general-purpose instruction-following models, e.g., ChatGPT. To this end, we …

Llava-critic: Learning to evaluate multimodal models

T Xiong, X Wang, D Guo, Q Ye, H Fan, Q Gu… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce LLaVA-Critic, the first open-source large multimodal model (LMM) designed as
a generalist evaluator to assess performance across a wide range of multimodal tasks …