A Survey of Multimodel Large Language Models
Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org
With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
Self-correction is an approach to improving responses from large language models (LLMs)
by refining the responses using LLMs during inference. Prior work has proposed various self …
A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions
The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …
Mitigating object hallucinations in large vision-language models through visual contrastive decoding
Large Vision-Language Models (LVLMs) have advanced considerably, intertwining
visual recognition and language understanding to generate content that is not only coherent …
RLHF-V: Towards trustworthy MLLMs via behavior alignment from fine-grained correctional human feedback
Multimodal Large Language Models (MLLMs) have recently demonstrated
impressive capabilities in multimodal understanding, reasoning, and interaction. However …
OPERA: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation
Hallucination, posed as a pervasive challenge of multi-modal large language models
(MLLMs), has significantly impeded their real-world usage that demands precise judgment …
HallusionBench: an advanced diagnostic suite for entangled language hallucination and visual illusion in large vision-language models
We introduce "HallusionBench," a comprehensive benchmark designed for the evaluation of
image-context reasoning. This benchmark presents significant challenges to advanced large …
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
Large language models (LLMs), after being aligned with vision models and integrated into
vision-language models (VLMs), can bring impressive improvement in image reasoning …
InternLM-XComposer2: Mastering free-form text-image composition and comprehension in vision-language large model
We introduce InternLM-XComposer2, a cutting-edge vision-language model excelling in free-
form text-image composition and comprehension. This model goes beyond conventional …
AdaShield: Safeguarding multimodal large language models from structure-based attack via adaptive shield prompting
With the advent and widespread deployment of Multimodal Large Language Models
(MLLMs), the imperative to ensure their safety has become increasingly pronounced …