Adversarial attacks and defenses in images, graphs and text: A review
Deep neural networks (DNN) have achieved unprecedented success in numerous machine
learning tasks in various domains. However, the existence of adversarial examples raises …
learning tasks in various domains. However, the existence of adversarial examples raises …
Advances in adversarial attacks and defenses in computer vision: A survey
Deep Learning is the most widely used tool in the contemporary field of computer vision. Its
ability to accurately solve complex problems is employed in vision research to learn deep …
ability to accurately solve complex problems is employed in vision research to learn deep …
Are aligned neural networks adversarially aligned?
Large language models are now tuned to align with the goals of their creators, namely to be"
helpful and harmless." These models should respond helpfully to user questions, but refuse …
helpful and harmless." These models should respond helpfully to user questions, but refuse …
Trustllm: Trustworthiness in large language models
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …
attention for their excellent natural language processing capabilities. Nonetheless, these …
Theoretically principled trade-off between robustness and accuracy
We identify a trade-off between robustness and accuracy that serves as a guiding principle
in the design of defenses against adversarial examples. Although this problem has been …
in the design of defenses against adversarial examples. Although this problem has been …
Adversarial examples are not bugs, they are features
Adversarial examples have attracted significant attention in machine learning, but the
reasons for their existence and pervasiveness remain unclear. We demonstrate that …
reasons for their existence and pervasiveness remain unclear. We demonstrate that …
Certified adversarial robustness via randomized smoothing
We show how to turn any classifier that classifies well under Gaussian noise into a new
classifier that is certifiably robust to adversarial perturbations under the L2 norm. While this" …
classifier that is certifiably robust to adversarial perturbations under the L2 norm. While this" …
Robustbench: a standardized adversarial robustness benchmark
As a research community, we are still lacking a systematic understanding of the progress on
adversarial robustness which often makes it hard to identify the most promising ideas in …
adversarial robustness which often makes it hard to identify the most promising ideas in …
Ensemble adversarial training: Attacks and defenses
Adversarial examples are perturbed inputs designed to fool machine learning models.
Adversarial training injects such examples into training data to increase robustness. To …
Adversarial training injects such examples into training data to increase robustness. To …
Adversarial training for free!
Adversarial training, in which a network is trained on adversarial examples, is one of the few
defenses against adversarial attacks that withstands strong attacks. Unfortunately, the high …
defenses against adversarial attacks that withstands strong attacks. Unfortunately, the high …