Profiling relational data: a survey
Profiling data to determine metadata about a given dataset is an important and frequent
activity of any IT professional and researcher and is necessary for various use-cases. It …
Data profiling: A tutorial
is to understand the dataset at hand and its metadata. The process of metadata discovery is
known as data profiling. Profiling activities range from ad-hoc approaches, such as eye …
Data quality: The other face of big data
In our Big Data era, data is being generated, collected and analyzed at an unprecedented
scale, and data-driven decision making is sweeping through all aspects of society. Recent …
Conditional functional dependencies for capturing data inconsistencies
We propose a class of integrity constraints for relational databases, referred to as conditional
functional dependencies (CFDs), and study their applications in data cleaning. In contrast to …
Discovering conditional functional dependencies
This paper investigates the discovery of conditional functional dependencies (CFDs). CFDs
are a recent extension of functional dependencies (FDs) by supporting patterns of …
Data profiling revisited
F Naumann - ACM SIGMOD Record, 2014 - dl.acm.org
Data profiling comprises a broad range of methods to efficiently analyze a given data set. In
a typical scenario, which mirrors the capabilities of commercial data profiling tools, tables of …
Data profiling
Data profiling refers to the activity of collecting data about data, i.e., metadata. Most IT
professionals and researchers who work with data have engaged in data profiling, at least …
Guided data repair
In this paper we present GDR, a Guided Data Repair framework that incorporates user
feedback in the cleaning process to enhance and accelerate existing automatic repair …
Discovering data quality rules
Dirty data is a serious problem for businesses leading to incorrect decision making,
inefficient daily operations, and ultimately wasting both time and money. Dirty data often …
Dependencies revisited for improving data quality
W Fan - Proceedings of the twenty-seventh ACM SIGMOD …, 2008 - dl.acm.org
Dependency theory is almost as old as relational databases themselves, and has
traditionally been used to improve the quality of schema, among other things. Recently there …