A note on using the F-measure for evaluating record linkage algorithms

D Hand, P Christen - Statistics and Computing, 2018 - Springer
Record linkage is the process of identifying and linking records about the same entities from
one or more databases. Record linkage can be viewed as a classification problem where …

Linking sensitive data

P Christen, T Ranbaduge, R Schnell - Methods and techniques for …, 2020 - Springer
Sensitive personal data are created in many application domains, and there is now an
increasing demand to share, integrate, and link such data within and across organisations in …

(Almost) all of entity resolution

O Binette, RC Steorts - Science Advances, 2022 - science.org
Whether the goal is to estimate the number of people that live in a congressional district, to
estimate the number of individuals that have died in an armed conflict, or to disambiguate …

Active learning for large-scale entity resolution

K Qian, L Popa, P Sen - Proceedings of the 2017 ACM on Conference on …, 2017 - dl.acm.org
Entity resolution (ER) is the task of identifying different representations of the same real-
world object across datasets. Designing and tuning ER algorithms is an error-prone, labor …

Enhancing entity resolution with a hybrid active machine learning framework: Strategies for optimal learning in sparse datasets

M Jabrane, H Tabbaa, A Hadri, I Hafidi - Information Systems, 2024 - Elsevier
When solving the problem of identifying similar records in different datasets (known as Entity
Resolution or ER), one big challenge is the lack of enough labeled data, which is crucial for …

BLOSS: Effective meta-blocking with almost no effort

G Dal Bianco, MA Gonçalves, D Duarte - Information Systems, 2018 - Elsevier
Record deduplication aims at identifying which records represent the same real-world object
in a dataset. As it is a task naturally quadratic (i.e., each record is a potential duplicate), a …

ERABQS: entity resolution based on active machine learning and balancing query strategy

J Mourad, T Hiba, R Yassir, H Imad - Journal of Intelligent Information …, 2024 - Springer
Entity Resolution (ER) is a crucial process in the field of data management and integration.
The primary goal of ER is to identify different profiles (or records) that refer to the same real …

Place deduplication with embeddings

C Yang, DH Hoang, T Mikolov, J Han - The World Wide Web Conference, 2019 - dl.acm.org
Thanks to the advancing mobile location services, people nowadays can post about places
to share visiting experience on-the-go. A large place graph not only helps users explore …

Entity matching with active monotone classification

Y Tao - Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI …, 2018 - dl.acm.org
Given two sets of entities X and Y, entity matching aims to decide whether x and y represent
the same entity for each pair (x, y) ∈ X × Y. As the last resort, human experts can be called …

Privacy-preserving temporal record linkage

T Ranbaduge, P Christen - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Record linkage (RL) is the process of identifying matching records from different databases
that refer to the same entity. It is common that the attribute values of records that belong to …