Privacy-preserving record linkage for big data: Current approaches and research challenges

D Vatsalan, Z Sehili, P Christen, E Rahm - Handbook of big data …, 2017 - Springer
The growth of Big Data, especially personal data dispersed in multiple data
sources, presents enormous opportunities and insights for businesses to explore and …

Linking sensitive data

P Christen, T Ranbaduge, R Schnell - Methods and techniques for …, 2020 - Springer
Sensitive personal data are created in many application domains, and there is now an
increasing demand to share, integrate, and link such data within and across organisations in …

Unsupervised graph-based entity resolution for complex entities

N Kirielle, P Christen, T Ranbaduge - ACM Transactions on Knowledge …, 2023 - dl.acm.org
Entity resolution (ER) is the process of linking records that refer to the same entity.
Traditionally, this process compares attribute values of records to calculate similarities and …

Cheat detection through temporal inference of constrained orders for subsequences

J Rogers, R Aygun, L Etzkorn - 2022 IEEE Fifth International …, 2022 - ieeexplore.ieee.org
For select domains and datasets, duplicates may be, in part or in whole, instances of
cheating. We may specifically observe this for Sony's PlayStation Network (PSN) that …

Noise-tolerant approximate blocking for dynamic real-time entity resolution

H Liang, Y Wang, P Christen, R Gayler - … 2014, Tainan, Taiwan, May 13-16 …, 2014 - Springer
Entity resolution is the process of identifying records in one or multiple data sources that
represent the same real-world entity. This process needs to deal with noisy data that contain …

Structured object matching across web page revisions

T Bleifuß, L Bornemann, DV Kalashnikov… - 2021 IEEE 37th …, 2021 - ieeexplore.ieee.org
A considerable amount of useful information on the web is (semi-) structured, such as tables
and lists. An extensive corpus of prior work addresses the problem of making these human …

Improving temporal record linkage using regression classification

Y Hu, Q Wang, D Vatsalan, P Christen - … and Data Mining: 21st Pacific-Asia …, 2017 - Springer
Temporal record linkage is the process of identifying groups of records that are collected
over a period of time, such as in census or voter registration databases, where records in the …

Bayesian analysis of state voter registration database integrity

J Cao, SS Kim, RM Alvarez - Statistics, Politics and Policy, 2022 - degruyter.com
How do we ensure a statewide voter registration database's accuracy and integrity,
especially when the database depends on aggregating decentralized, sub-state data with …

Privacy-preserving temporal record linkage

T Ranbaduge, P Christen - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Record linkage (RL) is the process of identifying matching records from different databases
that refer to the same entity. It is common that the attribute values of records that belong to …

Preparation of a real temporal voter data set for record linkage and duplicate detection research

P Christen - Technical Report. The Australian National University, 2014 - cs.anu.edu.au
This report describes the process involved in accessing, processing, and merging a set of
files containing voter registration information from the US state of North Carolina (NC). We …