[BOOK][B] The data matching process
P Christen, P Christen - 2012 - Springer
This chapter provides an overview of the data matching process, and describes the five
major steps involved in this process: data pre-processing (cleaning and standardisation) …
Evaluation of entity resolution approaches on real-world match problems
Despite the huge amount of recent research efforts on entity resolution (matching) there has
not yet been a comparative evaluation on the relative effectiveness and efficiency of …
Data-Centric Systems and Applications
The rapid growth of the Web in the past two decades has made it the largest publicly
accessible data source in the world. Web mining aims to discover useful information or …
Frameworks for entity matching: A comparison
Entity matching is a crucial and difficult task for data integration. Entity matching frameworks
provide several methods and their combination to effectively solve different match tasks. In …
[PDF][PDF] An important aspect of big data: data usability
**建中, 刘显敏 - 2013 - cs.sjtu.edu.cn
Cardinality estimation of approximate substring queries using deep learning
Cardinality estimation of an approximate substring query is an important problem in
database systems. Traditional approaches build a summary from the text data and estimate …
Astrid: accurate selectivity estimation for string predicates using deep learning
Accurate selectivity estimation for string predicates is a long-standing research challenge in
databases. Supporting pattern matching on strings (such as prefix, substring, and suffix) …
[HTML][HTML] Parallel set similarity join on big data based on locality-sensitive hashing
Due to the huge amount of involved data and time-consuming process of join operations, the
exact-match joins are rarely used for big data. The most common alternative for exact-match …
[PDF][PDF] DuDe: The duplicate detection toolkit
Duplicate detection, also known as entity matching or record linkage, was first defined by
Newcombe et al.[19] and has been a research topic for several decades. The challenge is to …
Comparative evaluation of entity resolution approaches with FEVER
We present FEVER, a new evaluation platform for entity resolution approaches. The modular
structure of the FEVER framework supports the incorporation or reconstruction of many …