An overview of end-to-end entity resolution for big data
One of the most critical tasks for improving data quality and increasing the reliability of data
analytics is Entity Resolution (ER), which aims to identify different descriptions that refer to …
[BOOK][B] Foundations of data quality management
Data quality is one of the most important problems in data management. A database system
typically aims to support the creation, maintenance and use of large amounts of data …
Big graphs: challenges and opportunities
W Fan - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
Big data is typically characterized by 4V's: Volume, Velocity, Variety and Veracity. When it
comes to big graphs, these challenges become even more staggering. Each and every of …
The LLUNATIC data-cleaning framework
Data-cleaning (or data-repairing) is considered a crucial problem in many database-related
tasks. It consists in making a database consistent with respect to a set of given constraints. In …
Relaxed functional dependencies—a survey of approaches
Recently, there has been a renewed interest in functional dependencies due to the
possibility of employing them in several advanced database operations, such as data …
Towards certain fixes with editing rules and master data
A variety of integrity constraints have been studied for data cleaning. While these constraints
can detect the presence of errors, they fall short of guiding us to correct the errors. Indeed …
Data quality: From theory to practice
W Fan - ACM SIGMOD Record, 2015 - dl.acm.org
Data quantity and data quality, like two sides of a coin, are equally important to data
management. This paper provides an overview of recent advances in the study of data …
Interaction between record matching and data repairing
Central to a data cleaning system are record matching and data repairing. Matching aims to
identify tuples that refer to the same real-world object, and repairing is to make a database …
[PDF] An important aspect of big data: data usability
**建中, 刘显敏 - Journal of Computer Research and Development (计算机研究与发展), 2013 - cs.sjtu.edu.cn
Data quality and explainable AI
In this work, we provide some insights and develop some ideas, with few technical details,
about the role of explanations in Data Quality in the context of data-based machine learning …