Combating misinformation in the age of LLMs: Opportunities and challenges

C Chen, K Shu - AI Magazine, 2024 - Wiley Online Library
Misinformation such as fake news and rumors is a serious threat to information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …

Weak-to-strong jailbreaking on large language models

X Zhao, X Yang, T Pang, C Du, L Li, YX Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are vulnerable to jailbreak attacks, resulting in harmful,
unethical, or biased text generations. However, existing jailbreaking methods are …

Large language model supply chain: A research agenda

S Wang, Y Zhao, X Hou, H Wang - ACM Transactions on Software …, 2024 - dl.acm.org
The rapid advancement of large language models (LLMs) has revolutionized artificial
intelligence, introducing unprecedented capabilities in natural language processing and …

Escalation risks from language models in military and diplomatic decision-making

JP Rivera, G Mukobi, A Reuel, M Lamparth… - Proceedings of the …, 2024 - dl.acm.org
Governments are increasingly considering integrating autonomous AI agents in high-stakes
military and foreign-policy decision-making, especially with the emergence of advanced …

CodeChameleon: Personalized encryption framework for jailbreaking large language models

H Lv, X Wang, Y Zhang, C Huang, S Dou, J Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
Adversarial misuse, particularly through 'jailbreaking' that circumvents a model's safety and
ethical protocols, poses a significant challenge for Large Language Models (LLMs). This …

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

L Lin, H Mu, Z Zhai, M Wang, Y Wang, R Wang… - Journal of Artificial …, 2025 - jair.org
Generative models are rapidly gaining popularity and being integrated into everyday
applications, raising concerns over their safe use as various vulnerabilities are exposed. In …

Mission impossible: A statistical perspective on jailbreaking LLMs

J Su, J Kempe, K Ullrich - Advances in Neural Information …, 2025 - proceedings.neurips.cc
Large language models (LLMs) are trained on a deluge of text data with limited quality
control. As a result, LLMs can exhibit unintended or even harmful behaviours, such as …

Permute-and-flip: An optimally robust and watermarkable decoder for LLMs

X Zhao, L Li, YX Wang - arXiv preprint arXiv:2402.05864, 2024 - arxiv.org
In this paper, we propose a new decoding method called Permute-and-Flip (PF) decoder. It
enjoys robustness properties similar to the standard sampling decoder, but is provably up to …

Rapid optimization for jailbreaking LLMs via subconscious exploitation and echopraxia

G Shen, S Cheng, K Zhang, G Tao, S An, L Yan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have become prevalent across diverse sectors,
transforming human life with their extraordinary reasoning and comprehension abilities. As …

Position: Technical research and talent is needed for effective AI governance

A Reuel, L Soder, B Bucknall… - Forty-first International …, 2024 - openreview.net
In light of recent advancements in AI capabilities and the increasingly widespread
integration of AI systems into society, governments worldwide are actively seeking to …