Combating misinformation in the age of LLMs: Opportunities and challenges
Misinformation such as fake news and rumors is a serious threat for information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …
Large language model supply chain: A research agenda
The rapid advancement of large language models (LLMs) has revolutionized artificial
intelligence, introducing unprecedented capabilities in natural language processing and …
Against The Achilles' Heel: A Survey on Red Teaming for Generative Models
Generative models are rapidly gaining popularity and being integrated into everyday
applications, raising concerns over their safe use as various vulnerabilities are exposed. In …
Weak-to-strong jailbreaking on large language models
Although significant efforts have been dedicated to aligning large language models (LLMs),
red-teaming reports suggest that these carefully aligned LLMs could still be jailbroken …
Escalation risks from language models in military and diplomatic decision-making
Governments are increasingly considering integrating autonomous AI agents in high-stakes
military and foreign-policy decision-making, especially with the emergence of advanced …
Mission impossible: A statistical perspective on jailbreaking LLMs
Large language models (LLMs) are trained on a deluge of text data with limited quality
control. As a result, LLMs can exhibit unintended or even harmful behaviours, such as …
Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs
In this paper, we propose a new decoding method called Permute-and-Flip (PF) decoder. It
enjoys robustness properties similar to the standard sampling decoder, but is provably up to …
CodeChameleon: Personalized encryption framework for jailbreaking large language models
Adversarial misuse, particularly through 'jailbreaking' that circumvents a model's safety and
ethical protocols, poses a significant challenge for Large Language Models (LLMs). This …
Rapid optimization for jailbreaking LLMs via subconscious exploitation and echopraxia
Large Language Models (LLMs) have become prevalent across diverse sectors,
transforming human life with their extraordinary reasoning and comprehension abilities. As …
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI
As generative AI, particularly large language models (LLMs), become increasingly
integrated into production applications, new attack surfaces and vulnerabilities emerge and …