A Survey of Multimodal Large Language Models

Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org
With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …

Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org
Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

L Huang, W Yu, W Ma, W Zhong, Z Feng… - ACM Transactions on …, 2025 - dl.acm.org
The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …

Woodpecker: Hallucination correction for multimodal large language models

S Yin, C Fu, S Zhao, T Xu, H Wang, D Sui… - Science China …, 2024 - Springer
Hallucination is a big shadow hanging over the rapidly evolving multimodal large language
models (MLLMs), referring to the phenomenon that the generated text is inconsistent with the image content …

A survey on hallucination in large vision-language models

H Liu, W Xue, Y Chen, D Chen, X Zhao, K Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent development of Large Vision-Language Models (LVLMs) has attracted growing
attention within the AI landscape for its practical implementation potential. However, …

Hallucination of multimodal large language models: A survey

Z Bai, P Wang, T Xiao, T He, Z Han, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
This survey presents a comprehensive analysis of the phenomenon of hallucination in
multimodal large language models (MLLMs), also known as Large Vision-Language Models …

MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models

MS Sepehri, Z Fabian, M Soltanolkotabi… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) have tremendous potential to improve the
accuracy, availability, and cost-effectiveness of healthcare by providing automated solutions …

RLAIF-V: Aligning MLLMs through open-source AI feedback for super GPT-4V trustworthiness

T Yu, H Zhang, Y Yao, Y Dang, D Chen, X Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Learning from feedback reduces the hallucination of multimodal large language models
(MLLMs) by aligning them with human preferences. While traditional methods rely on labor …

Aligning modalities in vision large language models via preference fine-tuning

Y Zhou, C Cui, R Rafailov, C Finn, H Yao - arXiv preprint arXiv:2402.11411, 2024 - arxiv.org
Instruction-following Vision Large Language Models (VLLMs) have achieved significant
progress recently on a variety of tasks. These approaches merge strong pre-trained vision …

HALC: Object hallucination reduction via adaptive focal-contrast decoding

Z Chen, Z Zhao, H Luo, H Yao, B Li, J Zhou - arXiv preprint arXiv …, 2024 - arxiv.org
While large vision-language models (LVLMs) have demonstrated impressive capabilities in
interpreting multi-modal contexts, they inevitably suffer from object hallucinations (OH). We …