Securing large language models: Addressing bias, misinformation, and prompt attacks

B Peng, K Chen, M Li, P Feng, Z Bi, J Liu… - arXiv preprint arXiv…, 2024 - arxiv.org
Large Language Models (LLMs) demonstrate impressive capabilities across various fields,
yet their increasing use raises critical security concerns. This article reviews recent literature …

Hallucination of multimodal large language models: A survey

Z Bai, P Wang, T Xiao, T He, Z Han, Z Zhang… - arXiv preprint arXiv…, 2024 - arxiv.org
This survey presents a comprehensive analysis of the phenomenon of hallucination in
multimodal large language models (MLLMs), also known as Large Vision-Language Models …

Fine-tuning multimodal LLMs to follow zero-shot demonstrative instructions

J Li, K Pan, Z Ge, M Gao, W Ji, W Zhang… - arXiv preprint arXiv…, 2023 - arxiv.org
Recent advancements in Multimodal Large Language Models (MLLMs) have been utilizing
Visual Prompt Generators (VPGs) to convert visual features into tokens that LLMs can …

A comprehensive survey of hallucination in large language, image, video and audio foundation models

P Sahoo, P Meharia, A Ghosh, S Saha, V Jain… - arXiv preprint arXiv…, 2024 - arxiv.org
The rapid advancement of foundation models (FMs) across language, image, audio, and
video domains has shown remarkable capabilities in diverse tasks. However, the …

Less is more: Mitigating multimodal hallucination from an EOS decision perspective

Z Yue, L Zhang, Q Jin - arXiv preprint arXiv:2402.14545, 2024 - arxiv.org
Large Multimodal Models (LMMs) often suffer from multimodal hallucinations, wherein they
may create content that is not present in the visual inputs. In this paper, we explore a new …

CogCoM: Train large vision-language models diving into details through chain of manipulations

J Qi, M Ding, W Wang, Y Bai, Q Lv, W Hong… - arXiv preprint arXiv…, 2024 - arxiv.org
Vision-Language Models (VLMs) have demonstrated their broad effectiveness thanks to
extensive training in aligning visual instructions to responses. However, such training of …

Towards unified multimodal editing with enhanced knowledge collaboration

K Pan, Z Fan, J Li, Q Yu, H Fei, S Tang… - Advances in …, 2025 - proceedings.neurips.cc
The swift advancement in Multimodal LLMs (MLLMs) also presents significant challenges for
effective knowledge editing. Current methods, including intrinsic knowledge editing and …

Data shunt: Collaboration of small and large models for lower costs and better performance

D Chen, Y Zhuang, S Zhang, J Liu, S Dong… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Pretrained large models, particularly large language models, have garnered increasing
attention, as they have demonstrated remarkable abilities through contextual learning …

Seeing clearly, answering incorrectly: A multimodal robustness benchmark for evaluating MLLMs on leading questions

Y Liu, Z Liang, Y Wang, M He, J Li, B Zhao - arXiv preprint arXiv…, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) have exhibited impressive capabilities in
visual understanding and reasoning, providing slightly reasonable answers, such as image …

Fake artificial intelligence generated contents (FAIGC): A survey of theories, detection methods, and opportunities

X Yu, Y Wang, Y Chen, Z Tao, D Xi, S Song… - arXiv preprint arXiv…, 2024 - arxiv.org
In recent years, generative artificial intelligence models, represented by Large Language
Models (LLMs) and Diffusion Models (DMs), have revolutionized content production …