How to DP-fy ML: A practical guide to machine learning with differential privacy

N Ponomareva, H Hazimeh, A Kurakin, Z Xu… - Journal of Artificial …, 2023 - jair.org
Abstract Machine Learning (ML) models are ubiquitous in real-world applications and are a
constant focus of research. Modern ML models have become more complex, deeper, and …
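The core recipe the guide covers is DP-SGD: clip each per-example gradient to a fixed norm and add calibrated Gaussian noise before the update. A minimal NumPy sketch for logistic regression, with illustrative hyperparameters rather than the guide's recommended settings:

```python
import numpy as np

def dp_sgd_logreg(X, y, epochs=5, lr=0.1, clip_norm=1.0, noise_mult=1.1, batch=64, seed=0):
    """Minimal DP-SGD sketch: clip each per-example gradient to clip_norm,
    then add Gaussian noise with std noise_mult * clip_norm to the sum."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch)):
            xb, yb = X[idx], y[idx]
            preds = 1.0 / (1.0 + np.exp(-xb @ w))
            per_ex_grads = (preds - yb)[:, None] * xb            # one gradient per example
            norms = np.linalg.norm(per_ex_grads, axis=1, keepdims=True)
            clipped = per_ex_grads / np.maximum(1.0, norms / clip_norm)
            noise = rng.normal(0.0, noise_mult * clip_norm, size=d)
            w -= lr * (clipped.sum(axis=0) + noise) / len(idx)
    return w
```

The privacy cost of a given (noise_mult, batch, epochs) setting is then computed separately with a privacy accountant.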

Detecting pretraining data from large language models

W Shi, A Ajith, M Xia, Y Huang, D Liu, T Blevins… - arXiv preprint arXiv …, 2023 - arxiv.org
Although large language models (LLMs) are widely deployed, the data used to train them is
rarely disclosed. Given the incredible scale of this data, up to trillions of tokens, it is all but …
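The paper's detector, Min-K% Prob, scores a candidate text by averaging its lowest-k% token log-probabilities under the target model; texts seen in pretraining tend to have fewer very-low-probability tokens. A hedged sketch, assuming the token log-probs have already been extracted from a forward pass of the target model (the threshold below is purely illustrative):

```python
import numpy as np

def min_k_percent_score(token_logprobs, k=20):
    """Min-K% Prob style score: average the lowest-k% token log-probabilities
    of a candidate text. Higher (less negative) scores suggest the text was
    more likely part of the pretraining data."""
    logprobs = np.sort(np.asarray(token_logprobs, dtype=float))
    n_low = max(1, int(len(logprobs) * k / 100))
    return float(logprobs[:n_low].mean())

# Log-probs would come from the target LLM; the threshold is calibrated
# on texts known not to be in the training data (values here are illustrative).
score = min_k_percent_score([-0.3, -5.2, -0.1, -7.8, -0.4, -0.2], k=33)
member_guess = score > -6.0
```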

LLM Dataset Inference: Did you train on my dataset?

P Maini, H Jia, N Papernot… - Advances in Neural …, 2025 - proceedings.neurips.cc
The proliferation of large language models (LLMs) in the real world has come with a rise in
copyright cases against companies for training their models on unlicensed data from the …

Label poisoning is all you need

R Jha, J Hayase, S Oh - Advances in Neural Information …, 2023 - proceedings.neurips.cc
In a backdoor attack, an adversary injects corrupted data into a model's training dataset in
order to gain control over its predictions on images with a specific attacker-defined trigger. A …
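For context, the standard dirty-label backdoor baseline both stamps a trigger and flips labels; Jha et al.'s contribution is showing that corrupting labels alone can suffice. A small NumPy sketch of the classic baseline, with an illustrative 3x3 corner trigger and assuming grayscale images of shape (N, H, W):

```python
import numpy as np

def poison_dataset(images, labels, target_class, rate=0.01, patch_value=1.0, seed=0):
    """Classic dirty-label backdoor baseline: stamp a small trigger patch on a
    fraction of training images and relabel them as the target class.
    (Jha et al. show a stronger attack that corrupts labels only.)"""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images[idx, -3:, -3:] = patch_value   # 3x3 trigger in the bottom-right corner
    labels[idx] = target_class
    return images, labels, idx
```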

Evaluations of machine learning privacy defenses are misleading

M Aerni, J Zhang, F Tramèr - Proceedings of the 2024 on ACM SIGSAC …, 2024 - dl.acm.org
Empirical defenses for machine learning privacy forgo the provable guarantees of
differential privacy in the hope of achieving higher utility while resisting realistic adversaries …

Do membership inference attacks work on large language models?

M Duan, A Suri, N Mireshghallah, S Min, W Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a
member of a target model's training data. Despite extensive research on traditional machine …
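The simplest MIA baseline, which this study and others build on, thresholds the target model's per-example loss: training examples tend to be fit more tightly than unseen ones. A minimal sketch, with threshold calibration and low-false-positive-rate evaluation left to the caller:

```python
import numpy as np

def loss_threshold_mia(losses, threshold):
    """Loss-threshold membership inference: predict 'member' when the target
    model's loss on a candidate example falls below the threshold."""
    return np.asarray(losses) < threshold

# The threshold is calibrated on examples with known membership status, and
# attacks are best reported as true-positive rate at a fixed low false-positive rate.
```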

Low-cost high-power membership inference attacks

S Zarifzadeh, P Liu, R Shokri - arXiv preprint arXiv:2312.03262, 2023 - arxiv.org
Membership inference attacks aim to detect if a particular data point was used in training a
model. We design a novel statistical test to perform robust membership inference attacks …
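The test compares the target model's behavior on a candidate point against reference models trained without it (and against random population samples). A simplified, hedged sketch of the reference-model ratio idea, omitting the paper's pairwise comparisons with population data:

```python
import numpy as np

def reference_model_mia_score(p_target, p_refs):
    """Reference-model membership score: ratio of the target model's probability
    on a candidate example to the average probability assigned by reference
    models trained without it. Ratios well above 1 suggest membership."""
    return p_target / (np.mean(p_refs) + 1e-12)

# Illustrative values; the decision threshold is set for a desired false-positive rate.
score = reference_model_mia_score(p_target=0.92, p_refs=[0.40, 0.35, 0.50])
is_member = score > 2.0
```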

TOFU: A task of fictitious unlearning for LLMs

P Maini, Z Feng, A Schwarzschild, ZC Lipton… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models trained on massive corpora of data from the web can memorize and
reproduce sensitive or private data, raising both legal and ethical concerns. Unlearning, or …
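One common baseline that unlearning benchmarks such as TOFU evaluate is gradient ascent on the forget set, which pushes the model's likelihood of the targeted examples down. A hedged PyTorch sketch, assuming a causal-LM-style model whose forward pass returns an object with a .loss field:

```python
import torch

def gradient_ascent_unlearn(model, forget_loader, lr=1e-5, steps=100, device="cpu"):
    """Gradient-ascent unlearning baseline: take ascent steps on the forget set
    so the model's likelihood of the targeted examples decreases."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    it = iter(forget_loader)
    for _ in range(steps):
        try:
            batch = next(it)
        except StopIteration:
            it = iter(forget_loader)
            batch = next(it)
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = -model(**batch).loss     # negate the loss to ascend on the forget set
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Benchmarks like TOFU then measure both forgetting quality on the forget set and retained utility on everything else.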

Unleashing the power of randomization in auditing differentially private ml

K Pillutla, G Andrew, P Kairouz… - Advances in …, 2023 - proceedings.neurips.cc
We present a rigorous methodology for auditing differentially private machine learning by
adding multiple carefully designed examples called canaries. We take a first principles …
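A canary audit inserts crafted examples into training, runs a membership attack on them, and converts the attack's true/false positive rates into an empirical lower bound on epsilon via the DP hypothesis-testing relation TPR <= e^eps * FPR + delta. A minimal sketch that ignores the confidence intervals a rigorous audit such as this one would account for:

```python
import numpy as np

def epsilon_lower_bound(tpr, fpr, delta=1e-5):
    """Empirical epsilon lower bound from a canary membership attack, using
    TPR <= e^eps * FPR + delta. Returns 0 when the attack is uninformative."""
    if fpr <= 0 or tpr <= delta:
        return 0.0
    return float(np.log((tpr - delta) / fpr))

# Example: the attack identifies 60% of inserted canaries at a 5% false-positive rate.
eps_hat = epsilon_lower_bound(tpr=0.60, fpr=0.05)
```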

[PDF][PDF] Adversarial machine learning

A Vassilev, A Oprea, A Fordyce, H Anderson - Gaithersburg, MD, 2024 - site.unibo.it
Abstract This NIST Trustworthy and Responsible AI report develops a taxonomy of concepts
and defines terminology in the field of adversarial machine learning (AML). The taxonomy is …