Alleviating hallucination in large vision-language models with active retrieval augmentation

X Qu, Q Chen, W Wei, J Sun, J Dong - arXiv preprint arXiv:2408.00555, 2024 - arxiv.org
Despite the remarkable ability of large vision-language models (LVLMs) in image
comprehension, these models frequently generate plausible yet factually incorrect …

Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability

D Shu, H Zhao, J Hu, W Liu, L Cheng, M Du - arXiv preprint arXiv …, 2025 - arxiv.org
Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in
processing both visual and textual information. However, the critical challenge of alignment …

Bring Retrieval Augmented Generation to Google Gemini via External API: An Evaluation with BIG-Bench Dataset

H Lee, S Kim - 2024 - researchsquare.com
The integration of Retrieval Augmented Generation (RAG) into existing large
language models represents a significant shift towards more dynamic and context-aware AI …

[CITATION] Alignment and Misalignment of Large Vision-Language Models: A Survey from the Perspective of Explainability

D Shu, H Zhao, J Hu, W Liu, L Cheng, M Du