- Academic Search

Llava-onevision: Easy visual task transfer

B Li, Y Zhang, D Guo, R Zhang, F Li, H Zhang… - ar…

… mathematical reasoning for multimodal large language models

W Shi, Z Hu, Y Bin, J Liu, Y Yang, SK Ng, L Bing… - ar…

… general-purpose instruction-following models, e.g., ChatGPT. To this end, we …

Llava-critic: Learning to evaluate multimodal models

T Xiong, X Wang, D Guo, Q Ye, H Fan, Q Gu… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce LLaVA-Critic, the first open-source large multimodal model (LMM) designed as
a generalist evaluator to assess performance across a wide range of multimodal tasks …
