BadNL: Backdoor attacks against NLP models with semantic-preserving improvements

X Chen, A Salem, D Chen, M Backes, S Ma… - Proceedings of the 37th …, 2021 - dl.acm.org
Deep neural networks (DNNs) have progressed rapidly during the past decade and have
been deployed in various real-world applications. Meanwhile, DNN models have been …

Unbiased watermark for large language models

Z Hu, L Chen, X Wu, Y Wu, H Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent advancements in large language models (LLMs) have sparked a growing
apprehension regarding the potential misuse. One approach to mitigating this risk is to …

Unraveling Attacks on Machine Learning-Based IoT Systems: A Survey and the Open Libraries Behind Them

C Liu, B Chen, W Shao, C Zhang… - IEEE Internet of …, 2024 - ieeexplore.ieee.org
The advent of the Internet of Things (IoT) has brought forth an era of unprecedented
connectivity, with an estimated 80 billion smart devices expected to be in operation by the …

“Real attackers don't compute gradients”: Bridging the gap between adversarial ML research and practice

G Apruzzese, HS Anderson, S Dambra… - … IEEE Conference on …, 2023 - ieeexplore.ieee.org
Recent years have seen a proliferation of research on adversarial machine learning.
Numerous papers demonstrate powerful algorithmic attacks against a wide variety of …

Your attack is too dumb: Formalizing attacker scenarios for adversarial transferability

M Alecci, M Conti, F Marchiori, L Martinelli… - Proceedings of the 26th …, 2023 - dl.acm.org
Evasion attacks are a threat to machine learning models, where adversaries attempt to affect
classifiers by injecting malicious samples. An alarming side-effect of evasion attacks is their …

GONE: A generic O(1) NoisE layer for protecting privacy of deep neural networks

H Zheng, J Chen, W Shangguan, Z Ming, X Yang… - Computers & …, 2023 - Elsevier
With the wide applications of deep neural networks (DNNs) in various fields, current
research shows their serious security risks due to the lack of privacy protection. Observing …

Watermark smoothing attacks against language models

H Chang, H Hassani, R Shokri - arXiv preprint arXiv:2407.14206, 2024 - arxiv.org
Watermarking is a technique used to embed a hidden signal in the probability distribution of
text generated by large language models (LLMs), enabling attribution of the text to the …

Evaluating the noise tolerance of Cloud NLP services across Amazon, Microsoft, and Google

J Barbosa, B Fonseca, M Ribeiro, J Correia… - Computers in …, 2025 - Elsevier
Natural Language Processing (NLP) has revolutionized industries, streamlining
customer service through applications in healthcare, finance, legal, and human resources …

Dual adversarial attacks: Fooling humans and classifiers

J Schneider, G Apruzzese - Journal of Information Security and Applications, 2023 - Elsevier
Adversarial samples mostly aim at fooling machine learning (ML) models. They often involve
minor pixel-based perturbations that are imperceptible to human observers. In this work …

Privacy Implications of Explainable AI in Data-Driven Systems

F Ezzeddine - arXiv preprint arXiv:2406.15789, 2024 - arxiv.org
Machine learning (ML) models, demonstrably powerful, suffer from a lack of interpretability.
The absence of transparency, often referred to as the black box nature of ML models …