A survey on evaluation of multimodal large language models
J Huang, J Zhang - arXiv preprint arXiv:2408.15769, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) mimic human perception and reasoning
system by integrating powerful Large Language Models (LLMs) with various modality …
Large Language Models Can Be Contextual Privacy Protection Learners
Abstract The proliferation of Large Language Models (LLMs) has driven considerable
interest in fine-tuning them with domain-specific data to create specialized language …
Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey
Visual Question Answering (VQA) is a challenge task that combines natural language
processing and computer vision techniques and gradually becomes a benchmark test task …
A survey on multimodal benchmarks: In the era of large AI models
The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial
advancements in artificial intelligence, significantly enhancing the capability to understand …
Enhancing Visual Reasoning with Autonomous Imagination in Multimodal Large Language Models
J Liu, Y Li, B Xiao, Y Jian, Z Qin, T Shao… - arXiv preprint arXiv …, 2024 - arxiv.org
There have been recent efforts to extend the Chain-of-Thought (CoT) paradigm to
Multimodal Large Language Models (MLLMs) by finding visual clues in the input scene …
Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches
A Mumuni, F Mumuni - arXiv preprint arXiv:2501.03151, 2025 - arxiv.org
Generative artificial intelligence (AI) systems based on large-scale pretrained foundation
models (PFMs) such as vision-language models, large language models (LLMs), diffusion …
AI Benchmarks and Datasets for LLM Evaluation
T Ivanov, V Penchev - arXiv preprint arXiv:2412.01020, 2024 - arxiv.org
LLMs demand significant computational resources for both pre-training and fine-tuning,
requiring distributed computing capabilities due to their large model sizes …
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
Active perception, a crucial human capability, involves setting a goal based on the current
understanding of the environment and performing actions to achieve that goal. Despite …
LLM Logical Reasoning Related to Aesthetic Universals
The recent surge in popularity of LLMs has led to increased interest in them and to extensive
research and evaluation of their reasoning abilities. Induction, deduction, and abduction …