Recent advances in adversarial training for adversarial robustness

T Bai, J Luo, J Zhao, B Wen, Q Wang - arXiv preprint arXiv:2102.01356, 2021 - arxiv.org
Adversarial training is one of the most effective approaches for defending against adversarial
examples for deep learning models. Unlike other defense strategies, adversarial training …
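The entry above concerns adversarial training, i.e. solving a min-max problem in which training examples are adversarially perturbed before each gradient step. A minimal sketch of the idea on a toy logistic-regression model, using a single FGSM step for the inner maximization (the model, data, and function names here are illustrative, not the survey's implementation):

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """FGSM: step of size eps in the sign of the input gradient (inner max)."""
    return x + eps * np.sign(grad)

def adversarial_train(X, y, eps=0.1, lr=0.1, epochs=200):
    """Train logistic regression on FGSM-perturbed inputs (outer min)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Input gradient of the logistic loss: dL/dx = (p - y) * w.
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_x = (p - y)[:, None] * w[None, :]
        X_adv = fgsm_perturb(X, grad_x, eps)   # craft adversarial batch
        # Gradient step on the adversarial batch, not the clean one.
        p_adv = 1.0 / (1.0 + np.exp(-(X_adv @ w + b)))
        err = p_adv - y
        w -= lr * (X_adv.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b
```

In practice the inner maximization uses several PGD steps rather than one FGSM step, but the alternation between attack and update is the same.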

Learning from noisy labels with deep neural networks: A survey

H Song, M Kim, D Park, Y Shin… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Deep learning has achieved remarkable success in numerous domains with the help of large
amounts of data. However, the quality of data labels is a concern because of the lack of …

RobustBench: a standardized adversarial robustness benchmark

F Croce, M Andriushchenko, V Sehwag… - arXiv preprint arXiv …, 2020 - arxiv.org
As a research community, we are still lacking a systematic understanding of the progress on
adversarial robustness which often makes it hard to identify the most promising ideas in …

Measuring robustness to natural distribution shifts in image classification

R Taori, A Dave, V Shankar, N Carlini… - Advances in …, 2020 - proceedings.neurips.cc
We study how robust current ImageNet models are to distribution shifts arising from natural
variations in datasets. Most research on robustness focuses on synthetic image …

Label-only membership inference attacks

CA Choquette-Choo, F Tramer… - International …, 2021 - proceedings.mlr.press
Membership inference is one of the simplest privacy threats faced by machine learning
models that are trained on private sensitive data. In this attack, an adversary infers whether a …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Improving robustness against common corruptions by covariate shift adaptation

S Schneider, E Rusak, L Eck… - Advances in neural …, 2020 - proceedings.neurips.cc
Today's state-of-the-art machine vision models are vulnerable to image corruptions like
blurring or compression artefacts, limiting their performance in many real-world applications …

RandAugment: Practical automated data augmentation with a reduced search space

ED Cubuk, B Zoph, J Shlens… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
Recent work on automated augmentation strategies has led to state-of-the-art results in
image classification and object detection. An obstacle to a large-scale adoption of these …
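The RandAugment entry above describes replacing learned augmentation policies with a reduced search space: just two global hyperparameters, the number of transforms n applied per image and a shared magnitude m. A minimal sketch of that policy over a toy transform pool (the pool and pixel-list image format here are illustrative; a real implementation would use PIL or torchvision ops):

```python
import random

# Toy transform pool keyed by name; each op takes (image, magnitude).
TRANSFORMS = {
    "identity":      lambda img, m: img,
    "invert":        lambda img, m: [255 - p for p in img],
    "brightness":    lambda img, m: [min(255, p + 10 * m) for p in img],
    "contrast_down": lambda img, m: [int(p * (1 - 0.05 * m)) for p in img],
}

def rand_augment(img, n=2, m=9, rng=random):
    """RandAugment policy: apply n transforms drawn uniformly at random,
    all at the single global magnitude m (no per-op probability search)."""
    for name in rng.choices(list(TRANSFORMS), k=n):
        img = TRANSFORMS[name](img, m)
    return img
```

Because the search space collapses to the pair (n, m), it can be tuned with a small grid search instead of the reinforcement-learning search earlier automated-augmentation work required.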

Adversarial examples are not bugs, they are features

A Ilyas, S Santurkar, D Tsipras… - Advances in neural …, 2019 - proceedings.neurips.cc
Adversarial examples have attracted significant attention in machine learning, but the
reasons for their existence and pervasiveness remain unclear. We demonstrate that …

Certified adversarial robustness via randomized smoothing

J Cohen, E Rosenfeld, Z Kolter - international conference on …, 2019 - proceedings.mlr.press
We show how to turn any classifier that classifies well under Gaussian noise into a new
classifier that is certifiably robust to adversarial perturbations under the L2 norm. While this …
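The randomized-smoothing entry above defines the smoothed classifier g(x) = argmax_c P(f(x + δ) = c) with δ ~ N(0, σ²I). A minimal Monte-Carlo sketch of the prediction step (function names and the toy base classifier in the usage note are illustrative, not the paper's code):

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma=0.25, n=1000, rng=None):
    """Monte-Carlo estimate of the smoothed classifier: sample Gaussian
    noise around x, classify each noisy copy, and take a majority vote.
    Returns (predicted class, estimated vote fraction)."""
    rng = rng or np.random.default_rng(0)
    noise = rng.normal(0.0, sigma, size=(n,) + x.shape)
    preds = np.array([base_classifier(x + d) for d in noise])
    classes, counts = np.unique(preds, return_counts=True)
    top = np.argmax(counts)
    return classes[top], counts[top] / n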