Data and information quality
C Batini, M Scannapieco - Cham, Switzerland: Springer International …, 2016 - Springer
This book is the result of a study path that started in 2006, when the two authors of this book
published the book Data Quality: Concepts, Methodologies and Techniques. After 8 years …
[BOOK][B] Foundations of data quality management
Data quality is one of the most important problems in data management. A database system
typically aims to support the creation, maintenance and use of large amounts of data …
Auto-em: End-to-end fuzzy entity-matching using pre-trained deep models and transfer learning
Entity matching (EM), also known as entity resolution, fuzzy join, and record linkage, refers to
the process of identifying records corresponding to the same real-world entities from …
Reasoning about record matching rules
To accurately match records it is often necessary to utilize the semantics of the data.
Functional dependencies (FDs) have proven useful in identifying tuples in a clean relation …
Large-scale deduplication with constraints using dedupalog
We present a declarative framework for collective deduplication of entity references in the
presence of constraints. Constraints occur naturally in many data cleaning domains and can …
Differential dependencies: Reasoning and discovery
The importance of difference semantics (e.g., “similar” or “dissimilar”) has been recently
recognized for declaring dependencies among various types of data, such as numerical …
[BOOK][B] Data Cleaning
V Ganti, AD Sarma - 2022 - books.google.com
Data warehouses consolidate various activities of a business and often form the backbone
for generating reports that support important business decisions. Errors in data tend to creep …
Efficient approximate entity extraction with edit distance constraints
Named entity recognition aims at extracting named entities from unstructured text. A recent
trend of named entity recognition is finding approximate matches in the text with respect to a …
Dynamic constraints for record matching
This paper investigates constraints for matching records from unreliable data sources. (a) We
introduce a class of matching dependencies (MDs) for specifying the semantics of unreliable …
[PDF][PDF] An important aspect of big data: data usability
**建中, 刘显敏 - 计算机研究与发展, 2013 - cs.sjtu.edu.cn
Abstract …