[BOOK] Data cleaning

IF Ilyas, X Chu - 2019 - books.google.com
This is an overview of the end-to-end data cleaning process. Data quality is one of the most
important problems in data management, since dirty data often leads to inaccurate data …

Trends in cleaning relational data: Consistency and deduplication

IF Ilyas, X Chu - Foundations and Trends® in Databases, 2015 - nowpublishers.com
Data quality is one of the most important problems in data management, since dirty data
often leads to inaccurate data analytics results and wrong business decisions. Poor data …

A logical approach to context-specific independence

J Corander, A Hyttinen, J Kontinen, J Pensar… - Annals of Pure and …, 2019 - Elsevier
Directed acyclic graphs (DAGs) constitute a qualitative representation for conditional
independence (CI) properties of a probability distribution. It is known that every CI statement …

Generalization of typed include dependencies with null values in databases

SV Zykin - information systems, 2023 - elibrary.ru
MSC2020: 68P15. Research article. Received July 7, 2023; after revision August 1, 2023;
accepted August 2, 2023. Full text in Russian. The paper discusses a new type of dependency in …

Declarative cleaning of inconsistencies in information extraction

R Fagin, B Kimelfeld, F Reiss… - ACM Transactions on …, 2016 - dl.acm.org
The population of a predefined relational schema from textual content, commonly known as
Information Extraction (IE), is a pervasive task in contemporary computational challenges …

Reasoning on property graphs with graph generating dependencies

LC Shimomura, N Yakovets, G Fletcher - Information Sciences, 2024 - Elsevier
Data dependencies are a key concept in data management and have been researched in
data integration, data quality and query optimization. With the increasing use of graph …

Querying big data: bridging theory and practice

W Fan, JP Huai - Journal of Computer Science and Technology, 2014 - Springer
Big data introduces challenges to query answering, from theory to practice. A number of
questions arise. What queries are "tractable" on big data? How can we make big data "…

Cleaning inconsistencies in information extraction via prioritized repairs

R Fagin, B Kimelfeld, F Reiss… - Proceedings of the 33rd …, 2014 - dl.acm.org
The population of a predefined relational schema from textual content, commonly known as
Information Extraction (IE), is a pervasive task in contemporary computational challenges …

RDFind: Scalable conditional inclusion dependency discovery in RDF datasets

S Kruse, A Jentzsch, T Papenbrock, Z Kaoudi… - Proceedings of the …, 2016 - dl.acm.org
Inclusion dependencies (INDs) form an important integrity constraint on relational
databases, supporting data management tasks, such as join path discovery and query …

Inclusion dependencies and their interaction with functional dependencies in SQL

H Köhler, S Link - Journal of Computer and System Sciences, 2017 - Elsevier
Driven by the SQL standard, we investigate simple and partial inclusion dependencies
(INDs) with not null constraints. Implication of simple INDs and not null constraints is not …