Google Scholar

J Huang, J Zhang - arXiv preprint arXiv:2408.15769, 2024 - arxiv.org

Multimodal Large Language Models (MLLMs) mimic human perception and reasoning
system by integrating powerful Large Language Models (LLMs) with various modality …

Cited by 18

A survey on multimodal benchmarks: In the era of large AI models

L Li, G Chen, H Shi, J Xiao, L Chen - arXiv preprint arXiv:2409.18142, 2024 - arxiv.org

The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial
advancements in artificial intelligence, significantly enhancing the capability to understand …

Cited by 4

Reasoning Limitations of Multimodal Large Language Models. A case study of Bongard Problems

M Małkiński, S Pawlonka, J Mańdziuk - arXiv preprint arXiv:2411.01173, 2024 - arxiv.org

Abstract visual reasoning (AVR) encompasses a suite of tasks whose solving requires the
ability to discover common concepts underlying the set of pictures through an analogy …

Cited by 1

Cognitive Paradigms for Evaluating VLMs on Visual Reasoning Task

M Vaishnav, T Tammet - arXiv preprint arXiv:2501.13620, 2025 - arxiv.org

Evaluating the reasoning capabilities of Vision-Language Models (VLMs) in complex visual
tasks provides valuable insights into their potential and limitations. In this work, we assess …

Towards Learning to Reason: Comparing LLMs with Neuro-Symbolic on Arithmetic Relations in Abstract Reasoning

M Hersche, G Camposampiero, R Wattenhofer… - arXiv preprint arXiv …, 2024 - arxiv.org

This work compares large language models (LLMs) and neuro-symbolic approaches in
solving Raven's progressive matrices (RPM), a visual abstract reasoning test that involves …

The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks

IR Galatzer-Levy, D Munday, J McGiffin, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

There is increasing interest in tracking the capabilities of general intelligence foundation
models. This study benchmarks leading large language models and vision language …
