A survey of adversarial defenses and robustness in NLP

S Goyal, S Doddapaneni, MM Khapra… - ACM Computing …, 2023 - dl.acm.org
In the past few years, it has become increasingly evident that deep neural networks are not
resilient enough to withstand adversarial perturbations in input data, leaving them …

Robust natural language processing: Recent advances, challenges, and future directions

M Omar, S Choi, DH Nyang, D Mohaisen - IEEE Access, 2022 - ieeexplore.ieee.org
Recent natural language processing (NLP) techniques have accomplished high
performance on benchmark data sets, primarily due to the significant improvement in the …

Adversarial GLUE: A multi-task benchmark for robustness evaluation of language models

B Wang, C Xu, S Wang, Z Gan, Y Cheng, J Gao… - arXiv preprint arXiv …, 2021 - arxiv.org
Large-scale pre-trained language models have achieved tremendous success across a
wide range of natural language understanding (NLU) tasks, even surpassing human …

Hidden trigger backdoor attack on NLP models via linguistic style manipulation

X Pan, M Zhang, B Sheng, J Zhu, M Yang - 31st USENIX Security …, 2022 - usenix.org
The vulnerability of deep neural networks (DNN) to backdoor (trojan) attacks is extensively
studied for the image domain. In a backdoor attack, a DNN is modified to exhibit expected …

“Real attackers don't compute gradients”: Bridging the gap between adversarial ML research and practice

G Apruzzese, HS Anderson, S Dambra… - … IEEE Conference on …, 2023 - ieeexplore.ieee.org
Recent years have seen a proliferation of research on adversarial machine learning.
Numerous papers demonstrate powerful algorithmic attacks against a wide variety of …

Risk taxonomy, mitigation, and assessment benchmarks of large language model systems

T Cui, Y Wang, C Fu, Y Xiao, S Li, X Deng, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have strong capabilities in solving diverse natural language
processing tasks. However, the safety and security issues of LLM systems have become the …

Towards a robust deep neural network against adversarial texts: A survey

W Wang, R Wang, L Wang, Z Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural networks (DNNs) have achieved remarkable success in various tasks (e.g.,
image classification, speech recognition, and natural language processing (NLP)). However …

Query-efficient adversarial attack with low perturbation against end-to-end speech recognition systems

S Wang, Z Zhang, G Zhu, X Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
With the widespread use of automated speech recognition (ASR) systems in modern
consumer devices, attacks against ASR systems have become an attractive topic in recent …

NeuronFair: Interpretable white-box fairness testing through biased neuron identification

H Zheng, Z Chen, T Du, X Zhang, Y Cheng… - Proceedings of the 44th …, 2022 - dl.acm.org
Deep neural networks (DNNs) have demonstrated their outperformance in various domains.
However, it raises a social concern whether DNNs can produce reliable and fair decisions …

A systematic review of machine learning algorithms in cyberbullying detection: future directions and challenges

M Arif - Journal of Information Security and Cybercrimes …, 2021 - journals.nauss.edu.sa
Social media networks are becoming an essential part of life for most of the world's
population. Detecting cyberbullying using machine learning and natural language …