Neural polarizer: A lightweight and effective backdoor defense via purifying poisoned features

M Zhu, S Wei, H Zha, B Wu - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent studies have demonstrated the susceptibility of deep neural networks to backdoor
attacks. Given a backdoored model, its prediction of a poisoned sample with a trigger will be …
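
The defense purifies poisoned features by splicing a lightweight learnable module into the frozen backdoored network and training only that module. A minimal PyTorch sketch of the plug-in structure, assuming the backbone can be split into front and rear halves; `NeuralPolarizer` and `insert_polarizer` are illustrative names, and the paper's actual training objective for the inserted layer is omitted:

```python
import torch
import torch.nn as nn

class NeuralPolarizer(nn.Module):
    """Lightweight learnable layer inserted into a frozen backdoored model.
    Only this layer is trained, so that it passes clean features through
    while filtering trigger-related ones. (Illustrative sketch; the paper's
    objective and placement may differ.)"""
    def __init__(self, num_channels: int):
        super().__init__()
        # A 1x1 convolution acts as a cheap, channel-wise feature "purifier".
        self.purify = nn.Conv2d(num_channels, num_channels, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.purify(feats)

def insert_polarizer(backbone_front: nn.Module,
                     backbone_rear: nn.Module,
                     num_channels: int) -> nn.Module:
    """Freeze the backdoored model and splice the polarizer between its
    front and rear halves; only the polarizer's parameters stay trainable."""
    for p in backbone_front.parameters():
        p.requires_grad_(False)
    for p in backbone_rear.parameters():
        p.requires_grad_(False)
    return nn.Sequential(backbone_front,
                         NeuralPolarizer(num_channels),
                         backbone_rear)
```

Because only the inserted layer is optimized, the defense stays cheap relative to full fine-tuning, which matches the "lightweight" claim in the title.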

Shared adversarial unlearning: Backdoor mitigation by unlearning shared adversarial examples

S Wei, M Zhang, H Zha, B Wu - Advances in Neural …, 2023 - proceedings.neurips.cc
Backdoor attacks are serious security threats to machine learning models, in which an
adversary injects poisoned samples into the training set, causing a backdoored model …
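
The title indicates that the defense unlearns adversarial examples that the backdoored model shares with its trigger behavior. A rough PyTorch simplification, assuming standard PGD as the adversarial-example generator; `pgd_attack` and `unlearn_shared_adversarial` are illustrative names, and the paper's actual shared-adversarial-example objective is more involved:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard PGD: craft an L-inf perturbation that maximizes the loss.
    For a backdoored model, such perturbations often align with trigger
    directions, which is the intuition this sketch leans on."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            delta += alpha * grad.sign()
            delta.clamp_(-eps, eps)
    # Assumes inputs live in [0, 1].
    return (x + delta).detach().clamp(0, 1)

def unlearn_shared_adversarial(model, loader, optimizer, device="cpu"):
    """Fine-tune so adversarial examples are mapped back to their correct
    labels while clean accuracy is preserved -- a rough stand-in for the
    paper's shared-adversarial-example unlearning objective."""
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.eval()                      # fixed stats while crafting
        x_adv = pgd_attack(model, x, y)
        model.train()
        loss = (F.cross_entropy(model(x_adv), y)
                + F.cross_entropy(model(x), y))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```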

Enhancing fine-tuning based backdoor defense with sharpness-aware minimization

M Zhu, S Wei, L Shen, Y Fan… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Backdoor defense, which aims to detect or mitigate the effect of malicious triggers introduced
by attackers, is becoming increasingly critical for machine learning security and integrity …
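
Sharpness-aware minimization (SAM, Foret et al.) replaces each gradient step with a two-pass update: perturb the weights toward the local loss maximum inside an L2 ball of radius rho, take the gradient there, then apply the descent step from the original weights. A minimal single-batch PyTorch sketch of that optimizer; the paper builds its fine-tuning defense on top of SAM, with details not shown in the snippet:

```python
import torch
import torch.nn.functional as F

def sam_step(model, x, y, base_opt, rho=0.05):
    """One SAM update on a single batch, wrapping any base optimizer."""
    # First pass: gradient at the current weights.
    loss = F.cross_entropy(model(x), y)
    base_opt.zero_grad()
    loss.backward()

    grads = [p.grad for p in model.parameters() if p.grad is not None]
    norm = torch.norm(torch.stack([g.norm() for g in grads]))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (norm + 1e-12)
            p.add_(e)               # climb to the sharpest nearby point
            eps.append(e)

    # Second pass: gradient at the perturbed weights.
    loss_perturbed = F.cross_entropy(model(x), y)
    base_opt.zero_grad()
    loss_perturbed.backward()

    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)           # restore the original weights
    base_opt.step()                 # descend using the perturbed gradient
    return loss.item()
```

The two forward/backward passes roughly double per-step cost, the usual trade-off for the flatter minima SAM targets.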

Attacks in adversarial machine learning: A systematic survey from the life-cycle perspective

B Wu, Z Zhu, L Liu, Q Liu, Z He, S Lyu - arXiv preprint arXiv:2302.09457, 2023 - arxiv.org
Adversarial machine learning (AML) studies the adversarial phenomenon of machine
learning, in which models may make predictions that are inconsistent with or unexpected by humans. Some …

Stealthy Backdoor Attack via Confidence-driven Sampling

P He, Y Xing, H Xu, J Ren, Y Cui, S Zeng… - … on Machine Learning … - openreview.net
Backdoor attacks facilitate unauthorized control in the testing stage by carefully injecting
harmful triggers during the training phase of deep neural networks. Previous works have …
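
The snippet does not spell out the sampling rule, but one plausible reading of confidence-driven sampling is to rank candidate training samples by a surrogate model's prediction confidence and stamp the trigger only onto the selected subset. A hypothetical PyTorch sketch under that assumption; `select_by_confidence` and `poison` are illustrative names, not the paper's API:

```python
import torch
import torch.nn.functional as F

def select_by_confidence(model, candidates, k, device="cpu"):
    """Return indices of the k candidates on which a surrogate model is
    least confident -- one plausible sampling criterion; the paper's
    exact rule may differ."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(candidates.to(device)), dim=1)
        confidence = probs.max(dim=1).values
    return torch.argsort(confidence)[:k]   # least confident first

def poison(x, trigger, mask):
    """Blend a trigger patch into selected images: x' = (1-m)*x + m*t,
    where mask m selects the patch region."""
    return (1 - mask) * x + mask * trigger
```

Poisoning only samples the model is already unsure about is one way such an attack could keep the injected behavior stealthy, since those samples perturb the clean decision boundary the least.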