When LLMs meet cybersecurity: A systematic literature review

J Zhang, H Bu, H Wen, Y Liu, H Fei… - …, 2025 - cybersecurity.springeropen.com
The rapid development of large language models (LLMs) has opened new avenues across
various fields, including cybersecurity, which faces an evolving threat landscape and …

SafetyPrompts: a systematic review of open datasets for evaluating and improving large language model safety

P Röttger, F Pernisi, B Vidgen, D Hovy - arXiv preprint arXiv:2404.05399, 2024 - arxiv.org
The last two years have seen a rapid growth in concerns around the safety of large
language models (LLMs). Researchers and practitioners have met these concerns by …

Detectors for safe and reliable LLMs: Implementations, uses, and limitations

S Achintalwar, AA Garcia, A Anaby-Tavor… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output
to biased and toxic generations. Due to several limiting factors surrounding LLMs (training …

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI

A Rawat, S Schoepf, G Zizzo, G Cornacchia… - arXiv preprint arXiv …, 2024 - arxiv.org
As generative AI, particularly large language models (LLMs), becomes increasingly
integrated into production applications, new attack surfaces and vulnerabilities emerge and …

Decolonial AI Alignment: Openness, Viśeṣa-Dharma, and Including Excluded Knowledges

KR Varshney - Proceedings of the AAAI/ACM Conference on AI, Ethics …, 2024 - ojs.aaai.org
Prior work has explicated the coloniality of artificial intelligence (AI) development and
deployment through mechanisms such as extractivism, automation, sociological …

Alignment Studio: Aligning large language models to particular contextual regulations

S Achintalwar, I Baldini, D Bouneffouf… - IEEE Internet …, 2024 - ieeexplore.ieee.org
The alignment of large language models is usually done by model providers to add or
control behaviors that are common or universally understood across use cases and …

DARE to Diversify: DAta Driven and Diverse LLM REd Teaming

M Nagireddy, B Guillén Pegueroles… - Proceedings of the 30th …, 2024 - dl.acm.org
Large language models (LLMs) have been rapidly adopted, as showcased by ChatGPT's
overnight popularity, and are integrated in products used by millions of people every day …

Dynamic normativity: Necessary and sufficient conditions for value alignment

NK Corrêa - arXiv preprint arXiv:2406.11039, 2024 - arxiv.org
The critical inquiry pervading the realm of Philosophy, and perhaps extending its influence
across all Humanities disciplines, revolves around the intricacies of morality and normativity …

Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs

G Zizzo, G Cornacchia, K Fraser, MZ Hameed… - arXiv preprint arXiv …, 2025 - arxiv.org
As large language models (LLMs) become integrated into everyday applications, ensuring
their robustness and security is increasingly critical. In particular, LLMs can be manipulated …

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Y Huang, C Gao, S Wu, H Wang, X Wang… - arXiv preprint arXiv …, 2025 - arxiv.org
Generative Foundation Models (GenFMs) have emerged as transformative tools. However,
their widespread adoption raises critical concerns regarding trustworthiness across …