How to DP-fy ML: A practical guide to machine learning with differential privacy

N Ponomareva, H Hazimeh, A Kurakin, Z Xu… - Journal of Artificial …, 2023 - jair.org
Abstract Machine Learning (ML) models are ubiquitous in real-world applications and are a
constant focus of research. Modern ML models have become more complex, deeper, and …

On provable copyright protection for generative models

N Vyas, SM Kakade, B Barak - International conference on …, 2023 - proceedings.mlr.press
There is a growing concern that learned conditional generative models may output samples
that are substantially similar to some copyrighted data $C$ that was in their training set. We …

Differentially private natural language models: Recent advances and future directions

L Hu, I Habernal, L Shen, D Wang - arXiv preprint arXiv:2301.09112, 2023 - arxiv.org
Recent developments in deep learning have led to great success in various natural
language processing (NLP) tasks. However, these applications may involve data that …

Privacy side channels in machine learning systems

E Debenedetti, G Severi, N Carlini… - 33rd USENIX Security …, 2024 - usenix.org
Most current approaches for protecting privacy in machine learning (ML) assume that
models exist in a vacuum. Yet, in reality, these models are part of larger systems that include …

Can public large language models help private cross-device federated learning?

B Wang, YJ Zhang, Y Cao, B Li, HB McMahan… - arXiv preprint arXiv …, 2023 - arxiv.org
We study (differentially) private federated learning (FL) of language models. The language
models in cross-device FL are relatively small and can be trained with meaningful formal …

Identifying and mitigating privacy risks stemming from language models: A survey

V Smith, AS Shamsabadi, C Ashurst… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have shown greatly enhanced performance in recent years,
attributed to increased size and extensive training data. This advancement has led to …

Purifying large language models by ensembling a small language model

T Li, Q Liu, T Pang, C Du, Q Guo, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The emerging success of large language models (LLMs) heavily relies on collecting
abundant training data from external (untrusted) sources. Despite substantial efforts devoted …

ViP: A differentially private foundation model for computer vision

Y Yu, M Sanjabi, Y Ma, K Chaudhuri, C Guo - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial intelligence (AI) has seen a tremendous surge in capabilities thanks to the use of
foundation models trained on internet-scale data. On the flip side, the uncurated nature of …

TextFusion: Privacy-preserving pre-trained model inference via token fusion

X Zhou, J Lu, T Gui, R Ma, Z Fei, Y Wang… - Proceedings of the …, 2022 - aclanthology.org
Recently, more and more pre-trained language models have been released as cloud services. This
allows users who lack computing resources to perform inference with a powerful model by …

Assessing privacy risks in language models: A case study on summarization tasks

R Tang, G Lueck, R Quispe, HA Inan, J Kulkarni… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models have revolutionized the field of NLP by achieving state-of-the-art
performance on various tasks. However, there is a concern that these models may disclose …