AI agents under threat: A survey of key security challenges and future pathways
An Artificial Intelligence (AI) agent is a software entity that autonomously performs tasks or
makes decisions based on pre-defined objectives and data inputs. AI agents, capable of …
Benchmarking large language models on CMExam - a comprehensive Chinese medical exam dataset
Recent advancements in large language models (LLMs) have transformed the field of
question answering (QA). However, evaluating LLMs in the medical field is challenging due …
Knowledge conflicts for LLMs: A survey
This survey provides an in-depth analysis of knowledge conflicts for large language models
(LLMs), highlighting the complex challenges they encounter when blending contextual and …
Privacy in large language models: Attacks, defenses and future directions
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …
MLLM-Protector: Ensuring MLLM's safety without hurting performance
The deployment of multimodal large language models (MLLMs) has brought forth a unique
vulnerability: susceptibility to malicious attacks through visual inputs. This paper investigates …
Strengthening multimodal large language model with bootstrapped preference optimization
Multimodal Large Language Models (MLLMs) excel in generating responses based
on visual inputs. However, they often suffer from a bias towards generating responses …
Against The Achilles' Heel: A Survey on Red Teaming for Generative Models
Generative models are rapidly gaining popularity and being integrated into everyday
applications, raising concerns over their safe use as various vulnerabilities are exposed. In …
The instruction hierarchy: Training LLMs to prioritize privileged instructions
Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow
adversaries to overwrite a model's original instructions with their own malicious prompts. In …
StruQ: Defending against prompt injection with structured queries
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated
applications, which perform text-based tasks by utilizing their advanced language …
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Instruction-tuned Large Language Models (LLMs) show impressive results in numerous
practical applications, but they lack essential safety features that are common in other areas …