DARPA's explainable AI (XAI) program: A retrospective
DARPA formulated the Explainable Artificial Intelligence (XAI) program in 2015 with the goal of enabling end users to better understand, trust, and effectively manage artificially intelligent …
Dataset security for machine learning: Data poisoning, backdoor attacks, and defenses
M Goldblum, D Tsipras, C **, L Fowl, G Somepalli… - arXiv preprint arXiv…, 2021 - arxiv.org
Data poisoning is a threat model in which a malicious actor tampers with training data to
manipulate outcomes at inference time. A variety of defenses against this threat model have …
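As a toy illustration of the threat model this survey covers (not a method from the paper itself), the sketch below flips a fraction of training labels and compares test accuracy of a victim model trained on clean versus poisoned data. The 20% poison fraction and the logistic-regression victim are arbitrary assumptions for demonstration.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# synthetic binary task standing in for a real training set
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# attacker flips labels on a fraction of the training set (20% is arbitrary)
rng = np.random.default_rng(0)
idx = rng.choice(len(y_tr), size=int(0.2 * len(y_tr)), replace=False)
y_poison = y_tr.copy()
y_poison[idx] = 1 - y_poison[idx]

clean_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
poison_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_poison).score(X_te, y_te)
print(f"clean test accuracy: {clean_acc:.3f}, poisoned: {poison_acc:.3f}")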
Handcrafted backdoors in deep neural networks
When machine learning training is outsourced to third parties, backdoor attacks
become practical as the third party who trains the model may act maliciously to inject hidden …
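This paper handcrafts backdoors directly in the model's weights; as a simpler illustration of the same outsourced-training threat model, the sketch below shows the classic data-level variant, where a malicious trainer stamps a pixel-patch trigger on some samples and relabels them. All names, the patch size, and the 5% poison fraction are hypothetical.

import numpy as np

def stamp_trigger(img, size=3, value=1.0):
    # stamp a small bright square in the bottom-right corner as the trigger
    img = img.copy()
    img[-size:, -size:] = value
    return img

def backdoor_dataset(images, labels, target_class, frac=0.05, seed=0):
    # stamp a fraction of samples and relabel them to the attacker's class
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    for i in rng.choice(len(images), size=int(frac * len(images)), replace=False):
        images[i] = stamp_trigger(images[i])
        labels[i] = target_class
    return images, labels

# usage on a fake batch of 100 grayscale 28x28 images
imgs = np.random.rand(100, 28, 28).astype(np.float32)
lbls = np.random.randint(0, 10, size=100)
p_imgs, p_lbls = backdoor_dataset(imgs, lbls, target_class=7)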
Just rotate it: Deploying backdoor attacks via rotation transformation
Recent works have demonstrated that deep learning models are vulnerable to backdoor
poisoning attacks, which instill spurious correlations with external trigger …
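A rough sketch of the core idea, assuming scipy is available: the trigger is a plain image rotation rather than an added pixel patch, so poisoned samples look natural. The 45-degree angle, poison fraction, and helper names are arbitrary assumptions, not the paper's exact settings.

import numpy as np
from scipy.ndimage import rotate

def rotate_trigger(img, angle=45.0):
    # the backdoor "trigger" is simply rotating the image by a fixed angle
    return rotate(img, angle=angle, reshape=False, mode="nearest")

def poison_with_rotation(images, labels, target_class, frac=0.05, seed=0):
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    for i in rng.choice(len(images), size=int(frac * len(images)), replace=False):
        images[i] = rotate_trigger(images[i])
        labels[i] = target_class  # rotated samples are relabeled to the target
    return images, labels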
Quarantine: Sparsity can uncover the trojan attack trigger for free
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally
on most samples, yet produce manipulated results for inputs carrying a particular …
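The dual behavior described here (normal on clean inputs, hijacked on triggered ones) is typically quantified with clean accuracy and attack success rate. Below is a generic evaluation helper in that spirit, not the paper's sparsity-based detector; model_fn and stamp_fn are hypothetical placeholders for a trained classifier's predict function and a trigger-stamping function.

import numpy as np

def evaluate_trojan(model_fn, stamp_fn, X, y, target_class):
    # clean accuracy: a trojaned model should look normal on benign inputs
    clean_acc = np.mean(model_fn(X) == y)
    # attack success rate: fraction of triggered inputs steered to the target
    X_trig = np.stack([stamp_fn(x) for x in X])
    asr = np.mean(model_fn(X_trig) == target_class)
    return clean_acc, asr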
Distilling cognitive backdoor patterns within an image
This paper proposes a simple method to distill and detect backdoor patterns within an
image: Cognitive Distillation (CD). The idea is to extract the "minimal essence" from …
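A loose PyTorch sketch of mask-based distillation in this spirit: optimize a sparse soft mask over the input so the masked image still yields the model's original output, keeping only the pixels that drive the prediction. The sigmoid relaxation, loss terms, and optimizer settings are assumptions, not the paper's exact objective.

import torch
import torch.nn.functional as F

def distill_mask(model, x, steps=100, lr=0.1, l1_weight=0.01):
    # learn a sparse soft mask so the masked input preserves the model output
    model.eval()
    with torch.no_grad():
        target = model(x)
    logits = torch.zeros_like(x, requires_grad=True)  # mask parameters
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        m = torch.sigmoid(logits)  # soft mask in (0, 1)
        # match the original output while penalizing mask density
        loss = F.mse_loss(model(x * m), target) + l1_weight * m.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(logits).detach()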
Accumulative poisoning attacks on real-time data
Collecting training data from untrusted sources exposes machine learning services to
poisoning adversaries, who maliciously manipulate training data to degrade the model …
poisoning adversaries, who maliciously manipulate training data to degrade the model …