Big data quality framework: a holistic approach to continuous quality management

I Taleb, MA Serhani, C Bouhaddioui, R Dssouli - Journal of Big Data, 2021 - Springer
Big Data is an essential research area for governments, institutions, and private agencies to
support their analytics decisions. Big Data refers to all about data, how it is collected …

VerifAI: verified generative AI

N Tang, C Yang, J Fan, L Cao, Y Luo… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative AI has made significant strides, yet concerns about the accuracy and reliability of
its outputs continue to grow. Such inaccuracies can have serious consequences such as …

Big data pre-processing: A quality framework

I Taleb, R Dssouli, MA Serhani - 2015 IEEE international …, 2015 - ieeexplore.ieee.org
With the abundance of raw data generated from various sources, Big Data has become a
preeminent approach in acquiring, processing, and analyzing large amounts of …

Towards dependable data repairing with fixing rules

J Wang, N Tang - Proceedings of the 2014 ACM SIGMOD international …, 2014 - dl.acm.org
One of the main challenges that data cleaning systems face is to automatically identify and
repair data errors in a dependable manner. Though data dependencies (aka integrity …

Towards reliable interactive data cleaning: A user survey and recommendations

S Krishnan, D Haas, MJ Franklin, E Wu - … of the Workshop on Human-In …, 2016 - dl.acm.org
Data cleaning is frequently an iterative process tailored to the requirements of a specific
analysis task. The design and implementation of iterative data cleaning tools presents novel …

Self-supervised and interpretable data cleaning with sequence generative adversarial networks

J Peng, D Shen, N Tang, T Liu, Y Kou, T Nie… - Proceedings of the …, 2022 - dl.acm.org
We study the problem of self-supervised and interpretable data cleaning, which
automatically extracts interpretable data repair rules from dirty data. In this paper, we …

Interactive and deterministic data cleaning

J He, E Veltri, D Santoro, G Li, G Mecca… - Proceedings of the …, 2016 - dl.acm.org
We present Falcon, an interactive, deterministic, and declarative data cleaning system,
which uses SQL update queries as the language to repair data. Falcon does not rely on the …

RLclean: An unsupervised integrated data cleaning framework based on deep reinforcement learning

J Peng, D Shen, T Nie, Y Kou - Information Sciences, 2024 - Elsevier
Data cleaning, a prerequisite to subsequent data analysis, has always been the focus of
data science research. Datasets with errors can severely detract from the quality of …

An hybrid approach to quality evaluation across big data value chain

MA Serhani, HT El Kassabi, I Taleb… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
While the potential benefits of Big Data adoption are significant, and some initial successes
have already been realized, there remain many research and technical challenges that must …

Big data cleaning

N Tang - Asia-Pacific Web Conference, 2014 - Springer
Data cleaning is, in fact, a lively subject that has played an important part in the history of
data management and data analytics, and it still is undergoing rapid development …