A note on using the F-measure for evaluating record linkage algorithms
Record linkage is the process of identifying and linking records about the same entities from
one or more databases. Record linkage can be viewed as a classification problem where …
[BOOK][B] The four generations of entity resolution
Entity Resolution (ER) lies at the core of data integration and cleaning and, thus, a bulk of
the research examines ways for improving its effectiveness and time efficiency. The initial …
Linking sensitive data
Sensitive personal data are created in many application domains, and there is now an
increasing demand to share, integrate, and link such data within and across organisations in …
(Almost) all of entity resolution
Whether the goal is to estimate the number of people that live in a congressional district, to
estimate the number of individuals that have died in an armed conflict, or to disambiguate …
Active learning for large-scale entity resolution
Entity resolution (ER) is the task of identifying different representations of the same real-
world object across datasets. Designing and tuning ER algorithms is an error-prone, labor …
BLOSS: Effective meta-blocking with almost no effort
Record deduplication aims at identifying which records represent the same real-world object
in a dataset. As it is a task naturally quadratic (i.e., each record is a potential duplicate), a …
Place deduplication with embeddings
Thanks to the advancing mobile location services, people nowadays can post about places
to share visiting experience on-the-go. A large place graph not only helps users explore …
Enhancing Entity Resolution with a hybrid Active Machine Learning framework: Strategies for optimal learning in sparse datasets
When solving the problem of identifying similar records in different datasets (known as Entity
Resolution or ER), one big challenge is the lack of enough labeled data. Which is crucial for …
Entity matching with active monotone classification
Y Tao - Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI …, 2018 - dl.acm.org
Given two sets of entities X and Y, entity matching aims to decide whether x and y represent
the same entity for each pair (x, y) ∈ X × Y. As the last resort, human experts can be called …
Skyblocking for entity resolution
In this paper, we introduce a novel framework for entity resolution blocking, called
skyblocking, which aims to learn scheme skylines. In this skyblocking framework, each …