A survey on evaluation of multimodal large language models
J Huang, J Zhang - arXiv preprint arXiv:2408.15769, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) mimic human perception and reasoning
system by integrating powerful Large Language Models (LLMs) with various modality …
A survey on multimodal benchmarks: In the era of large AI models
The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial
advancements in artificial intelligence, significantly enhancing the capability to understand …
Reasoning Limitations of Multimodal Large Language Models. A case study of Bongard Problems
Abstract visual reasoning (AVR) encompasses a suite of tasks whose solving requires the
ability to discover common concepts underlying the set of pictures through an analogy …
Cognitive Paradigms for Evaluating VLMs on Visual Reasoning Task
Evaluating the reasoning capabilities of Vision-Language Models (VLMs) in complex visual
tasks provides valuable insights into their potential and limitations. In this work, we assess …
Towards Learning to Reason: Comparing LLMs with Neuro-Symbolic on Arithmetic Relations in Abstract Reasoning
This work compares large language models (LLMs) and neuro-symbolic approaches in
solving Raven's progressive matrices (RPM), a visual abstract reasoning test that involves …
The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks
There is increasing interest in tracking the capabilities of general intelligence foundation
models. This study benchmarks leading large language models and vision language …