Security and privacy challenges of large language models: A survey
Large language models (LLMs) have demonstrated extraordinary capabilities and
contributed to multiple fields, such as generating and summarizing text, language …
Metamath: Bootstrap your own mathematical questions for large language models
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …
Red-Teaming for generative AI: Silver bullet or security theater?
In response to rising concerns surrounding the safety, security, and trustworthiness of
Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red …
Jailbreak attacks and defenses against large language models: A survey
Large Language Models (LLMs) have performed exceptionally in various text-generative
tasks, including question answering, translation, code completion, etc. However, the over …
A comprehensive study of jailbreak attack versus defense for large language models
Large Language Models (LLMs) have increasingly become central to generating
content with potential societal impacts. Notably, these models have demonstrated …
A causal explainable guardrails for large language models
Large Language Models (LLMs) have shown impressive performance in natural language
tasks, but their outputs can exhibit undesirable attributes or biases. Existing methods for …
Jailbreakzoo: Survey, landscapes, and horizons in jailbreaking large language and vision-language models
The rapid evolution of artificial intelligence (AI) through developments in Large Language
Models (LLMs) and Vision-Language Models (VLMs) has brought significant advancements …
Against The Achilles' Heel: A Survey on Red Teaming for Generative Models
Generative models are rapidly gaining popularity and being integrated into everyday
applications, raising concerns over their safe use as various vulnerabilities are exposed. In …
Safedecoding: Defending against jailbreak attacks via safety-aware decoding
As large language models (LLMs) become increasingly integrated into real-world
applications such as code generation and chatbot assistance, extensive efforts have been …
Safe unlearning: A surprisingly effective and generalizable solution to defend against jailbreak attacks
LLMs are known to be vulnerable to jailbreak attacks, even after safety alignment. An
important observation is that, while different types of jailbreak attacks can generate …