A survey of adversarial defenses and robustness in NLP

S Goyal, S Doddapaneni, MM Khapra… - ACM Computing …, 2023 - dl.acm.org
In the past few years, it has become increasingly evident that deep neural networks are not
resilient enough to withstand adversarial perturbations in input data, leaving them …

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have set off a new wave of AI enthusiasm for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

AI robustness: a human-centered perspective on technological challenges and opportunities

A Tocchetti, L Corti, A Balayn, M Yurrita… - ACM Computing …, 2022 - dl.acm.org
Despite the impressive performance of Artificial Intelligence (AI) systems, their robustness
remains elusive and constitutes a key issue that impedes large-scale adoption. Besides …

SoK: Certified robustness for deep neural networks

L Li, T Xie, B Li - 2023 IEEE symposium on security and privacy …, 2023 - ieeexplore.ieee.org
Great advances in deep neural networks (DNNs) have led to state-of-the-art performance on
a wide range of tasks. However, recent studies have shown that DNNs are vulnerable to …

Interactive model cards: A human-centered approach to model documentation

A Crisan, M Drouhard, J Vig, N Rajani - … of the 2022 ACM Conference on …, 2022 - dl.acm.org
Deep learning models for natural language processing (NLP) are increasingly adopted and
deployed by analysts without formal training in NLP or machine learning (ML). However, the …

Introduction to neural network verification

A Albarghouthi - Foundations and Trends® in Programming …, 2021 - nowpublishers.com
Deep learning has transformed the way we think of software and what it can do. But deep
neural networks are fragile and their behaviors are often surprising. In many settings, we …

RS-Del: Edit distance robustness certificates for sequence classifiers via randomized deletion

Z Huang, NG Marchant, K Lucas… - Advances in …, 2023 - proceedings.neurips.cc
Randomized smoothing is a leading approach for constructing classifiers that are certifiably
robust against adversarial examples. Existing work on randomized smoothing has focused …

NLP verification: Towards a general methodology for certifying robustness

M Casadio, T Dinkar, E Komendantskaya… - arXiv preprint arXiv …, 2024 - arxiv.org
Machine Learning (ML) has exhibited substantial success in the field of Natural Language
Processing (NLP). For example, large language models have empirically proven to be …

Reliability assurance for deep neural network architectures against numerical defects

L Li, Y Zhang, L Ren, Y Xiong… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
With the widespread deployment of deep neural networks (DNNs), ensuring the reliability of
DNN-based systems is of great importance. Serious reliability issues such as system failures …

UniT: a unified look at certified robust training against text adversarial perturbation

M Ye, Z Yin, T Zhang, T Du, J Chen… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recent years have witnessed a surge of certified robust training pipelines against text
adversarial perturbation constructed by synonym substitutions. Given a base model, existing …