- Academic Search

Teochat: A large vision-language assistant for temporal earth observation data

JA Irvin, ER Liu, JC Chen, I Dormoy, J Kim… - arXiv preprint arXiv …, 2024 - arxiv.org

Large vision and language assistants have enabled new capabilities for interpreting natural
images. These approaches have recently been adapted to earth observation data, but they …

Cited by 5

Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability

D Shu, H Zhao, J Hu, W Liu, L Cheng, M Du - arXiv preprint arXiv …, 2025 - arxiv.org

Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in
processing both visual and textual information. However, the critical challenge of alignment …

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

J Chen, T Zhang, S Huang, Y Niu, L Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Despite the recent breakthroughs achieved by Large Vision Language Models (LVLMs) in
understanding and responding to complex visual-textual contexts, their inherent …

Cited by 2

Safe+ Safe= Unsafe? Exploring How Safe Images Can Be Exploited to Jailbreak Large Vision-Language Models

C Cui, G Deng, A Zhang, J Zheng, Y Li, L Gao… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advances in Large Vision-Language Models (LVLMs) have showcased strong
reasoning abilities across multiple modalities, achieving significant breakthroughs in various …

VidHal: Benchmarking Temporal Hallucinations in Vision LLMs

WY Choong, Y Guo, M Kankanhalli - arXiv preprint arXiv:2411.16771, 2024 - arxiv.org

Vision Large Language Models (VLLMs) are widely acknowledged to be prone to
hallucination. Existing research addressing this problem has primarily been confined to …

Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal

Y Wang, Z Zhu, H Liu, Y Liao, H Liu, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Multimodal large language models (MLLMs) excel at multimodal perception and
understanding, yet their tendency to generate hallucinated or inaccurate responses …

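[CITATION] 大规模视觉-语言模型的对齐与不对齐: 从可解释性的视角进行的调查 (Alignment and Misalignment of Large Vision-Language Models: A Survey from the Perspective of Explainability)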

D Shu, H Zhao, J Hu, W Liu, L Cheng, M Du

Saved to My library

Investigating and mitigating the multimodal hallucination snowballing in large vision-language...

Teochat: A large vision-language assistant for temporal earth observation data
