Privacy-preserving record linkage for big data: Current approaches and research challenges

D Vatsalan, Z Sehili, P Christen, E Rahm - Handbook of big data …, 2017 - Springer
Abstract: The growth of Big Data, especially personal data dispersed in multiple data
sources, presents enormous opportunities and insights for businesses to explore and …

Modern privacy-preserving record linkage techniques: An overview

A Gkoulalas-Divanis, D Vatsalan… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Record linkage is the challenging task of deciding which records, coming from disparate
data sources, refer to the same entity. Established back in 1946 by Halbert L. Dunn, the area …

Deep learning for entity matching: A design space exploration

S Mudgal, H Li, T Rekatsinas, AH Doan… - Proceedings of the …, 2018 - dl.acm.org
Entity matching (EM) finds data instances that refer to the same real-world entity. In this
paper we examine applying deep learning (DL) to EM, to understand DL's benefits and …

Linking sensitive data

P Christen, T Ranbaduge, R Schnell - Methods and techniques for …, 2020 - Springer
Sensitive personal data are created in many application domains, and there is now an
increasing demand to share, integrate, and link such data within and across organisations in …

On the accuracy and scalability of probabilistic data linkage over the Brazilian 114 million cohort

R Pita, C Pinto, S Sena, R Fiaccone… - IEEE journal of …, 2018 - ieeexplore.ieee.org
Data linkage refers to the process of identifying and linking records that refer to the same
entity across multiple heterogeneous data sources. This method has been widely utilized …

CorDEL: a contrastive deep learning approach for entity linkage

Z Wang, B Sisman, H Wei, XL Dong… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
Entity linkage (EL) is a critical problem in data cleaning and integration. In the past several
decades, EL has typically been done by rule-based systems or traditional machine learning …

The past, present and future of the German Record Linkage Center (GRLC)

M Antoni, R Schnell - Jahrbücher für Nationalökonomie und Statistik, 2019 - degruyter.com
Linking data on the same units (such as persons, enterprises or patents) is an increasingly
popular research strategy, also in the social sciences (Schnell, 2014b). Since in many cases …

Incremental clustering techniques for multi-party privacy-preserving record linkage

D Vatsalan, P Christen, E Rahm - Data & Knowledge Engineering, 2020 - Elsevier
Abstract: Privacy-Preserving Record Linkage (PPRL) supports the integration of sensitive
information from multiple datasets, in particular the privacy-preserving matching of records …

Evaluating privacy-preserving record linkage using cryptographic long-term keys and multibit trees on large medical datasets

AP Brown, C Borgs, SM Randall, R Schnell - BMC medical informatics and …, 2017 - Springer
Background: Integrating medical data using databases from different sources by record
linkage is a powerful technique increasingly used in medical research. Under many …

Parallel Privacy-preserving Record Linkage using LSH-based Blocking

M Franke, Z Sehili, E Rahm - IoTBDS, 2018 - scitepress.org
Privacy-preserving record linkage (PPRL) aims at integrating person-related data without
revealing sensitive information. For this purpose, PPRL schemes typically use encoded …