Pre-trained trojan attacks for visual recognition

A Liu, X Liu, X Zhang, Y Xiao, Y Zhou, S Liang… - International Journal of …, 2025 - Springer
Pre-trained vision models (PVMs) have become a dominant component due to their
exceptional performance when fine-tuned for downstream tasks. However, the presence of …

A survey of backdoor attacks and defenses on large language models: Implications for security measures

S Zhao, M Jia, Z Guo, L Gan, X Xu, X Wu, J Fu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs), which bridge the gap between human language
understanding and complex problem-solving, achieve state-of-the-art performance on …

RLHFPoison: Reward poisoning attack for reinforcement learning with human feedback in large language models

J Wang, J Wu, M Chen, Y Vorobeychik… - Proceedings of the …, 2024 - aclanthology.org
Reinforcement Learning with Human Feedback (RLHF) is a methodology designed
to align Large Language Models (LLMs) with human preferences, playing an important role …

Weak-to-Strong Backdoor Attack for Large Language Models

S Zhao, L Gan, Z Guo, X Wu, L Xiao, X Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite being widely applied due to their exceptional capabilities, Large Language Models
(LLMs) have been proven to be vulnerable to backdoor attacks. These attacks introduce …

TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models

P Cheng, Y Ding, T Ju, Z Wu, W Du, P Yi… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have raised concerns about potential security threats
despite performing significantly in Natural Language Processing (NLP). Backdoor attacks …

New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook

M Yang, T Zhu, C Liu, WL Zhou, S Yu, PS Yu - arXiv preprint arXiv …, 2024 - arxiv.org
Thanks to the explosive growth of data and the development of computational resources, it is
possible to build pre-trained models that can achieve outstanding performance on various …

A Survey of Recent Backdoor Attacks and Defenses in Large Language Models

S Zhao, M Jia, Z Guo, L Gan, X Xu, X Wu, J Fu… - … on Machine Learning … - openreview.net
Large Language Models (LLMs), which bridge the gap between human language
understanding and complex problem-solving, achieve state-of-the-art performance on …