A note on using the F-measure for evaluating record linkage algorithms

D Hand, P Christen - Statistics and Computing, 2018 - Springer
Record linkage is the process of identifying and linking records about the same entities from
one or more databases. Record linkage can be viewed as a classification problem where …

[BOOK][B] The four generations of entity resolution

Entity Resolution (ER) lies at the core of data integration and cleaning and, thus, a bulk of
the research examines ways for improving its effectiveness and time efficiency. The initial …

Linking sensitive data

P Christen, T Ranbaduge, R Schnell - Methods and techniques for …, 2020 - Springer
Sensitive personal data are created in many application domains, and there is now an
increasing demand to share, integrate, and link such data within and across organisations in …

(Almost) all of entity resolution

O Binette, RC Steorts - Science Advances, 2022 - science.org
Whether the goal is to estimate the number of people that live in a congressional district, to
estimate the number of individuals that have died in an armed conflict, or to disambiguate …

Active learning for large-scale entity resolution

K Qian, L Popa, P Sen - Proceedings of the 2017 ACM on Conference on …, 2017 - dl.acm.org
Entity resolution (ER) is the task of identifying different representations of the same real-
world object across datasets. Designing and tuning ER algorithms is an error-prone, labor …

BLOSS: Effective meta-blocking with almost no effort

G Dal Bianco, MA Gonçalves, D Duarte - Information Systems, 2018 - Elsevier
Record deduplication aims at identifying which records represent the same real-world object
in a dataset. As it is a task naturally quadratic (i.e., each record is a potential duplicate), a …

Place deduplication with embeddings

C Yang, DH Hoang, T Mikolov, J Han - The World Wide Web Conference, 2019 - dl.acm.org
Thanks to the advancing mobile location services, people nowadays can post about places
to share visiting experience on-the-go. A large place graph not only helps users explore …

Enhancing Entity Resolution with a hybrid Active Machine Learning framework: Strategies for optimal learning in sparse datasets

M Jabrane, H Tabbaa, A Hadri, I Hafidi - Information Systems, 2024 - Elsevier
When solving the problem of identifying similar records in different datasets (known as Entity
Resolution or ER), one big challenge is the lack of enough labeled data. Which is crucial for …

Entity matching with active monotone classification

Y Tao - Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI …, 2018 - dl.acm.org
Given two sets of entities X and Y, entity matching aims to decide whether x and y represent
the same entity for each pair (x, y) ∈ X × Y. As the last resort, human experts can be called …

Skyblocking for entity resolution

J Shao, Q Wang, Y Lin - Information Systems, 2019 - Elsevier
In this paper, we introduce a novel framework for entity resolution blocking, called
skyblocking, which aims to learn scheme skylines. In this skyblocking framework, each …