Enhancing autonomous system security and resilience with generative AI: A comprehensive survey

M Andreoni, WT Lunardi, G Lawton, S Thakkar - IEEE Access, 2024 - ieeexplore.ieee.org
This survey explores the transformative role of Generative Artificial Intelligence (GenAI) in
enhancing the trustworthiness, reliability, and security of autonomous systems such as …

Cybench: A framework for evaluating cybersecurity capabilities and risks of language models

AK Zhang, N Perry, R Dulepet, J Ji, JW Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Language Model (LM) agents for cybersecurity that are capable of autonomously identifying
vulnerabilities and executing exploits have the potential to cause real-world impact …

Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities

MA Ferrag, F Alwahedi, A Battah, B Cherif… - Internet of Things and …, 2025 - Elsevier
This paper provides a comprehensive review of the future of cybersecurity through
Generative AI and Large Language Models (LLMs). We explore LLM applications across …

Dynamic intelligence assessment: Benchmarking LLMs on the road to AGI with a focus on model confidence

N Tihanyi, T Bisztray, RA Dubniczky… - … Conference on Big …, 2024 - ieeexplore.ieee.org
As machine intelligence evolves, the need to test and compare the problem-solving abilities
of different AI models grows. However, current benchmarks are often simplistic, allowing …

Advancing cyber incident timeline analysis through retrieval-augmented generation and large language models

FY Loumachi, MC Ghanem, MA Ferrag - Computers, 2025 - repository.londonmet.ac.uk
Cyber timeline analysis or forensic timeline analysis is critical in digital forensics and
incident response (DFIR) investigations. It involves examining artefacts and events …

Enhancing Security in Software Design Patterns and Antipatterns: A Framework for LLM-Based Detection

R Andrade, J Torres, I Ortiz-Garcés - Electronics, 2025 - mdpi.com
The detection of security vulnerabilities in software design patterns and antipatterns is
crucial for maintaining robust and maintainable systems, particularly in dynamic Continuous …

Benchmarking and Evaluating Large Language Models in Phishing Detection for Small and Midsize Enterprises: A Comprehensive Analysis

J Zhang, P Wu, J London, D Tenney - IEEE Access, 2025 - ieeexplore.ieee.org
The proliferation of Generative Artificial Intelligence (GenAI) has driven significant innovation
but also introduced new security risks, particularly in social engineering attacks such as …

OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities

M Kouremetis, M Dotter, A Byrne, D Martin… - arXiv preprint arXiv …, 2025 - arxiv.org
The prospect of artificial intelligence (AI) competing in the adversarial landscape of cyber
security has long been considered one of the most impactful, challenging, and potentially …

Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models

Y Tan, B Zheng, B Zheng, K Cao, H Jing, J Wei… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid advancement of Large Language Models (LLMs), significant safety concerns
have emerged. Fundamentally, the safety of large language models is closely linked to the …

Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training

YC Yu, TH Chiang, CW Tsai, CM Huang… - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Models (LLMs) have shown remarkable advancements in specialized
fields such as finance, law, and medicine. However, in cybersecurity, we have noticed a lack …