Anti-backdoor learning: Training clean models on poisoned data
Backdoor attacks have emerged as a major security threat to deep neural networks (DNNs).
While existing defense methods have demonstrated promising results on detecting or …
Neural attention distillation: Erasing backdoor triggers from deep neural networks
Deep neural networks (DNNs) are known to be vulnerable to backdoor attacks, a training-time
attack that injects a trigger pattern into a small proportion of the training data so as to control the …
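For context, the training-time trigger injection referred to in this and several of the entries below can be sketched roughly as follows. This is a generic BadNets-style illustration, not the method of any cited paper; the patch size, poison rate, and target label are illustrative assumptions.

```python
import numpy as np

def poison_dataset(images, labels, poison_rate=0.05, target_label=0, patch_size=3):
    """Stamp a small trigger patch onto a random fraction of training images
    and relabel them to an attacker-chosen target class (BadNets-style sketch).
    All parameter values here are assumptions for illustration only."""
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = np.random.choice(len(images), n_poison, replace=False)
    # Place a bright square trigger in the bottom-right corner of each selected image.
    images[idx, -patch_size:, -patch_size:] = images.max()
    # Flip the labels of poisoned samples to the target class.
    labels[idx] = target_label
    return images, labels, idx
```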
Backdoor defense with machine unlearning
Backdoor injection attacks are an emerging threat to the security of neural networks; however,
effective defense methods against such attacks remain limited. In this paper, we …
Backdoor defense via deconfounded representation learning
Deep neural networks (DNNs) have recently been shown to be vulnerable to backdoor attacks,
where attackers embed hidden backdoors in the DNN model by injecting a few poisoned …
Poisoning attacks and defenses on artificial intelligence: A survey
Machine learning models have been widely adopted across many fields. However, recent
studies have revealed several vulnerabilities to attacks with the potential to jeopardize the …
On the exploitability of reinforcement learning with human feedback for large language models
Reinforcement Learning with Human Feedback (RLHF) is a methodology designed to align
Large Language Models (LLMs) with human preferences, playing an important role in LLMs …
GRIP-GAN: An attack-free defense through general robust inverse perturbation
Despite its tremendous popularity and success in computer vision (CV) and natural
language processing, deep learning is inherently vulnerable to adversarial attacks in which …
One4all: Manipulate one agent to poison the cooperative multi-agent reinforcement learning
Reinforcement Learning (RL) has achieved plenty of breakthroughs in the past decade.
Notably, existing studies have shown that RL suffers from poisoning attacks, which result …
Backdoor attacks on crowd counting
Crowd counting is a regression task that estimates the number of people in a scene image,
which plays a vital role in a range of safety-critical applications, such as video surveillance …
DeepPoison: Feature transfer based stealthy poisoning attack for DNNs
J Chen, L Zhang, H Zheng, X Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural networks are susceptible to poisoning attacks via training data purposely
polluted with specific triggers. As existing studies have mainly focused on attack success rate with …