Pre-trained trojan attacks for visual recognition
Pre-trained vision models (PVMs) have become a dominant component due to their
exceptional performance when fine-tuned for downstream tasks. However, the presence of …
A survey of backdoor attacks and defenses on large language models: Implications for security measures
Large Language Models (LLMs), which bridge the gap between human language
understanding and complex problem-solving, achieve state-of-the-art performance on …
RLHFPoison: Reward poisoning attack for reinforcement learning with human feedback in large language models
Reinforcement Learning with Human Feedback (RLHF) is a methodology designed
to align Large Language Models (LLMs) with human preferences, playing an important role …
Weak-to-Strong Backdoor Attack for Large Language Models
Despite being widely applied due to their exceptional capabilities, Large Language Models
(LLMs) have been proven to be vulnerable to backdoor attacks. These attacks introduce …
TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models
Large language models (LLMs) have raised concerns about potential security threats
despite performing significantly in Natural Language Processing (NLP). Backdoor attacks …
New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Thanks to the explosive growth of data and the development of computational resources, it is
possible to build pre-trained models that can achieve outstanding performance on various …
A Survey of Recent Backdoor Attacks and Defenses in Large Language Models
Large Language Models (LLMs), which bridge the gap between human language
understanding and complex problem-solving, achieve state-of-the-art performance on …