[BOOK][B] Data cleaning
This is an overview of the end-to-end data cleaning process. Data quality is one of the most
important problems in data management, since dirty data often leads to inaccurate data …
Trends in cleaning relational data: Consistency and deduplication
Data quality is one of the most important problems in data management, since dirty data
often leads to inaccurate data analytics results and wrong business decisions. Poor data …
[HTML][HTML] A logical approach to context-specific independence
Directed acyclic graphs (DAGs) constitute a qualitative representation for conditional
independence (CI) properties of a probability distribution. It is known that every CI statement …
Generalization of typed include dependencies with null values in databases
SV Zykin - information systems, 2023 - elibrary.ru
MSC2020: 68P15 Received July 7, 2023 Research article After revision August 1, 2023 Full
text in Russian Accepted August 2, 2023 The paper discusses a new type of dependency in …
Declarative cleaning of inconsistencies in information extraction
The population of a predefined relational schema from textual content, commonly known as
Information Extraction (IE), is a pervasive task in contemporary computational challenges …
[HTML][HTML] Reasoning on property graphs with graph generating dependencies
Data dependencies are a key concept in data management and have been researched in
data integration, data quality and query optimization. With the increasing use of graph …
Querying big data: bridging theory and practice
W Fan, JP Huai - Journal of Computer Science and Technology, 2014 - Springer
Big data introduces challenges to query answering, from theory to practice. A number of
questions arise. What queries are "tractable" on big data? How can we make big data " …
Cleaning inconsistencies in information extraction via prioritized repairs
The population of a predefined relational schema from textual content, commonly known as
Information Extraction (IE), is a pervasive task in contemporary computational challenges …
RDFind: Scalable conditional inclusion dependency discovery in RDF datasets
Inclusion dependencies (INDs) form an important integrity constraint on relational
databases, supporting data management tasks, such as join path discovery and query …
[HTML][HTML] Inclusion dependencies and their interaction with functional dependencies in SQL
Driven by the SQL standard, we investigate simple and partial inclusion dependencies
(INDs) with not null constraints. Implication of simple INDs and not null constraints is not …