- Academic Search

X Qu, Q Chen, W Wei, J Sun, J Dong - arXiv preprint arXiv:2408.00555, 2024 - arxiv.org

Despite the remarkable ability of large vision-language models (LVLMs) in image
comprehension, these models frequently generate plausible yet factually incorrect …


[PDF] arxiv.org

Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability

D Shu, H Zhao, J Hu, W Liu, L Cheng, M Du - arXiv preprint arXiv …, 2025 - arxiv.org

Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in
processing both visual and textual information. However, the critical challenge of alignment …


[PDF] researchsquare.com

Bring Retrieval Augmented Generation to Google Gemini via External API: An Evaluation with BIG-Bench Dataset

H Lee, S Kim - 2024 - researchsquare.com

The integration of Retrieval Augmented Generation (RAG) into existing large
language models represents a significant shift towards more dynamic and context-aware AI …


[CITATION][C] Alignment and Misalignment of Large Vision-Language Models: A Survey from the Perspective of Explainability

D Shu, H Zhao, J Hu, W Liu, L Cheng, M Du


Towards Retrieval-Augmented Architectures for Image Captioning

Alleviating hallucination in large vision-language models with active retrieval augmentation
