Can I trust my fake data – a comprehensive quality assessment framework for synthetic tabular data in healthcare

VB Vallevik, A Babic, SE Marshall, E Severin… - International Journal of …, 2024 - Elsevier
Background Ensuring safe adoption of AI tools in healthcare hinges on access to sufficient
data for training, testing and validation. Synthetic data has been suggested in response to …

Tabular and latent space synthetic data generation: a literature review

J Fonseca, F Bacao - Journal of Big Data, 2023 - Springer
The generation of synthetic data can be used for anonymization, regularization,
oversampling, semi-supervised learning, self-supervised learning, and several other tasks …

Mimicking clinical trials with synthetic acute myeloid leukemia patients using generative artificial intelligence

JN Eckardt, W Hahn, C Röllig, S Stasik… - NPJ digital …, 2024 - nature.com
Clinical research relies on high-quality patient data, however, obtaining big data sets is
costly and access to existing data is often hindered by privacy and regulatory concerns …

An evaluation framework for synthetic data generation models

IE Livieris, N Alimpertis, G Domalis… - … Conference on Artificial …, 2024 - Springer
Nowadays, the use of synthetic data has gained popularity as a cost-efficient strategy for
enhancing data augmentation for improving machine learning models performance as well …

Can we trust synthetic data in medicine? A scoping review of privacy and utility metrics

B Kaabachi, J Despraz, T Meurers, K Otte, M Halilovic… - medRxiv, 2023 - medrxiv.org
Introduction Sharing and re-using health-related data beyond the scope of its initial
collection is essential for accelerating research, developing robust and trustworthy machine …

A survey on data synthesis and augmentation for large language models

K Wang, J Zhu, M Ren, Z Liu, S Li, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The success of Large Language Models (LLMs) is inherently linked to the availability of vast,
diverse, and high-quality data for training and evaluation. However, the growth rate of high …

Thinking in Categories: A Survey on Assessing the Quality for Time Series Synthesis

M Stenger, A Bauer, T Prantl, R Leppich… - ACM Journal of Data …, 2024 - dl.acm.org
Time series data are widely used and provide a wealth of information for countless
applications. However, some applications are faced with a limited amount of data, or the …

Statistical validation of synthetic data for lung cancer patients generated by using generative adversarial networks

L Gonzalez-Abril, C Angulo, JA Ortega… - Electronics, 2022 - mdpi.com
The development of healthcare patient digital twins in combination with machine learning
technologies helps doctors in therapeutic prescription and in minimally invasive intervention …

Exploring innovative approaches to synthetic tabular data generation

E Papadaki, AG Vrahatis, S Kotsiantis - Electronics, 2024 - mdpi.com
The rapid advancement of data generation techniques has spurred innovation across
multiple domains. This comprehensive review delves into the realm of data generation …

MargCTGAN: A “Marginally” Better CTGAN for the Low Sample Regime

T Afonja, D Chen, M Fritz - DAGM German Conference on Pattern …, 2023 - Springer
The potential of realistic and useful synthetic data is significant. However, current evaluation
methods for synthetic tabular data generation predominantly focus on downstream task …