Challenges and opportunities of generative models on tabular data

AX Wang, SS Chukova, CR Simpson, BP Nguyen - Applied Soft Computing, 2024 - Elsevier
Tabular data, organized like tables with rows and columns, is widely used. Existing models
for tabular data synthesis often face limitations related to data size or complexity. In contrast …

GAN-Based Tabular Data Generator for Constructing Synopsis in Approximate Query Processing: Challenges and Solutions

M Fallahian, M Dorodchi, K Kreth - Machine Learning and Knowledge …, 2024 - mdpi.com
In data-driven systems, data exploration is imperative for making real-time decisions.
However, big data are stored in massive databases that are difficult to retrieve. Approximate …

Evaluating the performance of automated machine learning (AutoML) tools for heart disease diagnosis and prediction

LM Paladino, A Hughes, A Perera, O Topsakal… - AI, 2023 - mdpi.com
Globally, over 17 million people annually die from cardiovascular diseases, with heart
disease being the leading cause of mortality in the United States. The ever-increasing …

Evaluating the Utility and Privacy of Synthetic Breast Cancer Clinical Trial Data Sets

S El Kababji, N Mitsakakis, X Fang… - JCO Clinical Cancer …, 2023 - ascopubs.org
PURPOSE: There is strong interest from patients, researchers, the pharmaceutical industry,
medical journal editors, funders of research, and regulators in sharing clinical trial data for …

Automating attendance management in human resources: A design science approach using computer vision and facial recognition

BT Nguyen-Tat, MQ Bui, VM Ngo - International Journal of Information …, 2024 - Elsevier
Haar Cascade is a cost-effective and user-friendly machine learning-based algorithm for
detecting objects in images and videos. Unlike Deep Learning algorithms, which typically …

An evaluation of synthetic data augmentation for mitigating covariate bias in health data

L Juwara, A El-Hussuna, K El Emam - Patterns, 2024 - cell.com
Data bias is a major concern in biomedical research, especially when evaluating large-scale
observational datasets. It leads to imprecise predictions and inconsistent estimates in …

Can I trust my fake data – A comprehensive quality assessment framework for synthetic tabular data in healthcare

VB Vallevik, A Babic, SE Marshall, E Severin… - International Journal of …, 2024 - Elsevier
Background: Ensuring safe adoption of AI tools in healthcare hinges on access to sufficient
data for training, testing and validation. Synthetic data has been suggested in response to …

Synthetic census microdata generation: A comparative study of synthesis methods examining the trade-off between disclosure risk and utility

C Little, R Allmendinger, M Elliot - Journal of Official Statistics, 2024 - journals.sagepub.com
There is growing interest in synthetic data generation as a means of allowing access to
useful data whilst preserving confidentiality. In particular, synthetic microdata generation …

Advancing student outcome predictions through generative adversarial networks

H Farhood, I Joudah, A Beheshti, S Muller - Computers and Education …, 2024 - Elsevier
Predicting student outcomes is essential in educational analytics for creating personalised
learning experiences. The effectiveness of these predictive models relies on having access …

Convex space learning for tabular synthetic data generation

M Mahendra, C Umesh, S Bej, K Schultz… - arXiv preprint arXiv …, 2024 - arxiv.org
Generating synthetic samples from the convex space of the minority class is a popular
oversampling approach for imbalanced classification problems. Recently, deep-learning …