How to DP-fy ML: A practical guide to machine learning with differential privacy

N Ponomareva, H Hazimeh, A Kurakin, Z Xu… - Journal of Artificial …, 2023 - jair.org
Machine Learning (ML) models are ubiquitous in real-world applications and are a
constant focus of research. Modern ML models have become more complex, deeper, and …

Fine-tuning large language models with user-level differential privacy

Z Charles, A Ganesh, R McKenna… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate practical and scalable algorithms for training large language models (LLMs)
with user-level differential privacy (DP) in order to provably safeguard all the examples …

Efficient and near-optimal noise generation for streaming differential privacy

KD Dvijotham, HB McMahan, K Pillutla… - 2024 IEEE 65th …, 2024 - ieeexplore.ieee.org
In the task of differentially private (DP) continual counting, we receive a stream of increments
and our goal is to output an approximate running total of these increments, without revealing …

Correlated noise provably beats independent noise for differentially private learning

CA Choquette-Choo, K Dvijotham, K Pillutla… - arXiv preprint arXiv …, 2023 - arxiv.org
Differentially private learning algorithms inject noise into the learning process. While the
most common private learning algorithm, DP-SGD, adds independent Gaussian noise in …

Teach LLMs to phish: Stealing private information from language models

A Panda, CA Choquette-Choo, Z Zhang, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
When large language models are trained on private data, it can be a significant privacy risk
for them to memorize and regurgitate sensitive information. In this work, we propose a new …

Efficient language model architectures for differentially private federated learning

JH Ro, S Bhojanapalli, Z Xu, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Cross-device federated learning (FL) is a technique that trains a model on data distributed
across typically millions of edge devices without data leaving the devices. SGD is the …

Robust and actively secure serverless collaborative learning

N Franzese, A Dziedzic… - Advances in …, 2024 - proceedings.neurips.cc
Collaborative machine learning (ML) is widely used to enable institutions to learn better
models from distributed data. While collaborative approaches to learning intuitively protect …

Federated learning in practice: reflections and projections

K Daly, H Eichner, P Kairouz… - 2024 IEEE 6th …, 2024 - ieeexplore.ieee.org
Federated Learning (FL) is a machine learning technique that enables multiple entities to
collaboratively learn a shared model without exchanging their local data. Over the past …

Privacy amplification for matrix mechanisms

CA Choquette-Choo, A Ganesh, T Steinke… - arXiv preprint arXiv …, 2023 - arxiv.org
Privacy amplification exploits randomness in data selection to provide tighter differential
privacy (DP) guarantees. This analysis is key to DP-SGD's success in machine learning, but …

Banded square root matrix factorization for differentially private model training

NP Kalinin, C Lampert - arXiv preprint arXiv:2405.13763, 2024 - arxiv.org
Current state-of-the-art methods for differentially private model training are based on matrix
factorization techniques. However, these methods suffer from high computational overhead …