Machine learning for synthetic data generation: a review
Y Lu, M Shen, H Wang, X Wang, C van Rechem… - ar…
… review of privacy and utility metrics in medical synthetic data
The use of synthetic data is a promising solution to facilitate the sharing and reuse of health-
related data beyond its initial collection while addressing privacy concerns. However, there …
Private synthetic data for multitask learning and marginal queries
We provide a differentially private algorithm for producing synthetic data simultaneously
useful for multiple tasks: marginal queries and multitask machine learning (ML). A key …
Generating private synthetic data with genetic algorithms
We study the problem of efficiently generating differentially private synthetic data that
approximate the statistical properties of an underlying sensitive dataset. In recent years …
Post-processing private synthetic data for improving utility on selected measures
Existing private synthetic data generation algorithms are agnostic to downstream tasks.
However, end users may have specific requirements that the synthetic data must satisfy …
Graphical vs. Deep Generative Models: Measuring the Impact of Differentially Private Mechanisms and Budgets on Utility
Generative models trained with Differential Privacy (DP) can produce synthetic data while
reducing privacy risks. However, navigating their privacy-utility tradeoffs makes finding the …
An optimal and scalable matrix mechanism for noisy marginals under convex loss functions
Noisy marginals are a common form of confidentiality-protecting data release and are useful
for many downstream tasks such as contingency table analysis, construction of Bayesian …
Towards principled assessment of tabular data synthesis algorithms
Data synthesis has been advocated as an important approach for utilizing data while
protecting data privacy. A large number of tabular data synthesis algorithms (which we call …