Mm-llms: Recent advances in multimodal large language models

D Zhang, Y Yu, J Dong, C Li, D Su, C Chu… - arXiv preprint arXiv …, 2024 - arxiv.org
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …

A comprehensive survey of hallucination mitigation techniques in large language models

SM Tonmoy, SM Zaman, V Jain, A Rani… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) continue to advance in their ability to write human-like
text, a key challenge remains around their tendency to hallucinate, generating content that …

Mm-safetybench: A benchmark for safety evaluation of multimodal large language models

X Liu, Y Zhu, J Gu, Y Lan, C Yang, Y Qiao - European Conference on …, 2024 - Springer
The security concerns surrounding Large Language Models (LLMs) have been extensively
explored, yet the safety of Multimodal Large Language Models (MLLMs) remains …

Adashield: Safeguarding multimodal large language models from structure-based attack via adaptive shield prompting

Y Wang, X Liu, Y Li, M Chen, C Xiao - European Conference on Computer …, 2024 - Springer
With the advent and widespread deployment of Multimodal Large Language Models
(MLLMs), the imperative to ensure their safety has become increasingly pronounced …

Mint: Evaluating llms in multi-turn interaction with tools and language feedback

X Wang, Z Wang, J Liu, Y Chen, L Yuan… - arXiv preprint arXiv …, 2023 - arxiv.org
To solve complex tasks, large language models (LLMs) often require multiple rounds of
interactions with the user, sometimes assisted by external tools. However, current evaluation …

Eyes closed, safety on: Protecting multimodal llms via image-to-text transformation

Y Gou, K Chen, Z Liu, L Hong, H Xu, Z Li… - … on Computer Vision, 2024 - Springer
Multimodal large language models (MLLMs) have shown impressive reasoning abilities.
However, they are also more vulnerable to jailbreak attacks than their LLM predecessors …

A survey of multimodal large language model from a data-centric perspective

T Bai, H Liang, B Wan, Y Xu, X Li, S Li, L Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) enhance the capabilities of standard large
language models by integrating and processing data from multiple modalities, including text …

The first to know: How token distributions reveal hidden knowledge in large vision-language models?

Q Zhao, M Xu, K Gupta, A Asthana, L Zheng… - … on Computer Vision, 2024 - Springer
Large vision-language models (LVLMs), designed to interpret and respond to human
instructions, occasionally generate hallucinated or harmful content due to inappropriate …

Jailbreakzoo: Survey, landscapes, and horizons in jailbreaking large language and vision-language models

H Jin, L Hu, X Li, P Zhang, C Chen, J Zhuang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid evolution of artificial intelligence (AI) through developments in Large Language
Models (LLMs) and Vision-Language Models (VLMs) has brought significant advancements …

A survey of attacks on large vision-language models: Resources, advances, and future trends

D Liu, M Yang, X Qu, P Zhou, Y Cheng… - arXiv preprint arXiv …, 2024 - arxiv.org
With the significant development of large models in recent years, Large Vision-Language
Models (LVLMs) have demonstrated remarkable capabilities across a wide range of …