Security and privacy challenges of large language models: A survey

BC Das, MH Amini, Y Wu - ACM Computing Surveys, 2024 - dl.acm.org
Large language models (LLMs) have demonstrated extraordinary capabilities and
contributed to multiple fields, such as generating and summarizing text, language …

Protecting your LLMs with information bottleneck

Z Liu, Z Wang, L Xu, J Wang, L Song, T Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of large language models (LLMs) has revolutionized the field of natural
language processing, yet these models can be attacked to produce harmful content. Despite efforts …

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI

A Rawat, S Schoepf, G Zizzo, G Cornacchia… - arXiv preprint arXiv …, 2024 - arxiv.org
As generative AI, particularly large language models (LLMs), becomes increasingly
integrated into production applications, new attack surfaces and vulnerabilities emerge and …

Improved techniques for optimization-based jailbreaking on large language models

X Jia, T Pang, C Du, Y Huang, J Gu, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are being rapidly developed, and a key component of their
widespread deployment is their safety-related alignment. Many red-teaming efforts aim to …

BlackDAN: A black-box multi-objective approach for effective and contextual jailbreaking of large language models

X Wang, VSJ Huang, R Chen, H Wang, C Pan… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) exhibit remarkable capabilities across various tasks,
they encounter potential security risks such as jailbreak attacks, which exploit vulnerabilities …