Systematic review of advanced AI methods for improving healthcare data quality in post COVID-19 Era

M Isgut, L Gloster, K Choi… - IEEE Reviews in …, 2022 - ieeexplore.ieee.org
At the beginning of the COVID-19 pandemic, there was significant hype about the potential
impact of artificial intelligence (AI) tools in combatting COVID-19 on diagnosis, prognosis, or …

Transformative strategies in photocatalyst design: merging computational methods and deep learning

J Liu, L Liang, B Su, D Wu, Y Zhang… - Journal of Materials …, 2024 - oaepublish.com
Photocatalysis is a unique technology that harnesses solar energy through in-situ
processes, operating without the need for external energy inputs. It is integral to advancing …

GoodCore: Data-effective and data-efficient machine learning through coreset selection over incomplete data

C Chai, J Liu, N Tang, J Fan, D Miao, J Wang… - Proceedings of the …, 2023 - dl.acm.org
Given a dataset with incomplete data (e.g., missing values), training a machine learning
model over the incomplete data requires two steps. First, it requires a data-effective step that …

An integrated network architecture for data repair and degradation trend prediction

Q Yang, B Tang, S Yang, Y Shen - Mechanical Systems and Signal …, 2023 - Elsevier
This paper proposed a network framework, namely DR-DTPN, which integrates data repair
and degradation trend prediction to address the serious deviation of equipment degradation …

Parker: Data fusion through consistent repairs using edit rules under partial keys

A Bronselaer, M Acosta - Information Fusion, 2023 - Elsevier
Data integration is the problem of consolidating information provided by multiple sources.
After schema mapping and duplicate detection have been dealt with, the problem consists in …

Data cleaning and machine learning: a systematic literature review

PO Côté, A Nikanjam, N Ahmed, D Humeniuk… - Automated Software …, 2024 - Springer
Machine Learning (ML) is integrated into a growing number of systems for various
applications. Because the performance of an ML model is highly dependent on the quality of …

RLclean: An unsupervised integrated data cleaning framework based on deep reinforcement learning

J Peng, D Shen, T Nie, Y Kou - Information Sciences, 2024 - Elsevier
Data cleaning, a prerequisite to subsequent data analysis, has always been the focus of
data science research. Datasets with errors can severely detract from the quality of …

LinCQA: Faster consistent query answering with linear time guarantees

Z Fan, P Koutris, X Ouyang, J Wijsen - … of the ACM on Management of …, 2023 - dl.acm.org
Most data analytical pipelines often encounter the problem of querying inconsistent data that
violate pre-determined integrity constraints. Data cleaning is an extensively studied …

Automatic data repair: Are we ready to deploy?

W Ni, X Miao, X Zhao, Y Wu, J Yin - arXiv preprint arXiv:2310.00711, 2023 - arxiv.org
Data quality is paramount in today's data-driven world, especially in the era of generative AI.
Dirty data with errors and inconsistencies usually leads to flawed insights, unreliable …

Natural generative noise diffusion model imputation

A Wibisono, P Mursanto, S See - Knowledge-Based Systems, 2024 - Elsevier
Imputation is a critical method for enhancing dataset quality, essential for ensuring accurate
analysis and insights. This research proposes an advanced imputation algorithm utilizing a …