Combating misinformation in the age of LLMs: Opportunities and challenges
Misinformation such as fake news and rumors is a serious threat to information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …
Survey of vulnerabilities in large language models revealed by adversarial attacks
Large Language Models (LLMs) are swiftly advancing in architecture and capability, and as
they integrate more deeply into complex systems, the urgency to scrutinize their security …
TrustLLM: Trustworthiness in large language models
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …
Jailbreak and guard aligned language models with only few in-context demonstrations
Large Language Models (LLMs) have shown remarkable success in various tasks, yet their
safety and the risk of generating harmful content remain pressing concerns. In this paper, we …
Position: TrustLLM: Trustworthiness in large language models
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …
Foundational challenges in assuring alignment and safety of large language models
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …
AutoDAN: Generating stealthy jailbreak prompts on aligned large language models
Aligned Large Language Models (LLMs) are powerful language understanding and
decision-making tools that are created through extensive alignment with human feedback …
Multilingual jailbreak challenges in large language models
While large language models (LLMs) exhibit remarkable capabilities across a wide range of
tasks, they pose potential safety concerns, such as the "jailbreak" problem, wherein …
Making them ask and answer: Jailbreaking large language models in few queries via disguise and reconstruction
In recent years, large language models (LLMs) have demonstrated notable success across
various tasks, but the trustworthiness of LLMs is still an open problem. One specific threat is …
How Johnny can persuade LLMs to jailbreak them: Rethinking persuasion to challenge AI safety by humanizing LLMs
Most traditional AI safety research has approached AI models as machines and centered on
algorithm-focused attacks developed by security experts. As large language models (LLMs) …