An overview of end-to-end entity resolution for big data

V Christophides, V Efthymiou, T Palpanas… - ACM Computing …, 2020 - dl.acm.org
One of the most critical tasks for improving data quality and increasing the reliability of data
analytics is Entity Resolution (ER), which aims to identify different descriptions that refer to …

Blocking and filtering techniques for entity resolution: A survey

G Papadakis, D Skoutas, E Thanos… - ACM Computing Surveys …, 2020 - dl.acm.org
Entity Resolution (ER), a core task of Data Integration, detects different entity profiles that
correspond to the same real-world object. Due to its inherently quadratic complexity, a series …

Linking sensitive data

P Christen, T Ranbaduge, R Schnell - Methods and techniques for …, 2020 - Springer
Sensitive personal data are created in many application domains, and there is now an
increasing demand to share, integrate, and link such data within and across organisations in …

A new robust approach for reversible database watermarking with distortion control

D Hu, D Zhao, S Zheng - IEEE Transactions on Knowledge and …, 2018 - ieeexplore.ieee.org
Nowadays information is crucial in many fields such as medicine, science and business,
where databases are used effectively for information sharing. However, the databases face …

End-to-end entity resolution for big data: A survey

V Christophides, V Efthymiou, T Palpanas… - arXiv preprint arXiv …, 2019 - arxiv.org
One of the most important tasks for improving data quality and the reliability of data analytics
results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the …

Schema-agnostic progressive entity resolution

G Simonini, G Papadakis, T Palpanas… - … on Knowledge and …, 2018 - ieeexplore.ieee.org
Entity Resolution (ER) is the task of finding entity profiles that correspond to the same
real-world entity. Progressive ER aims to efficiently resolve large datasets when limited time …

On the accuracy and scalability of probabilistic data linkage over the Brazilian 114 million cohort

R Pita, C Pinto, S Sena, R Fiaccone… - IEEE journal of …, 2018 - ieeexplore.ieee.org
Data linkage refers to the process of identifying and linking records that refer to the same
entity across multiple heterogeneous data sources. This method has been widely utilized …

SparkER: Scaling entity resolution in Spark

L Gagliardelli, G Simonini, D Beneventano… - Advances in Database …, 2019 - iris.unimore.it
We present SparkER, an ER tool that can scale practitioners' favorite ER algorithms.
SparkER has been devised to take full advantage of parallel and distributed computation as …

Scaling entity resolution: A loosely schema-aware approach

G Simonini, L Gagliardelli, S Bergamaschi… - Information Systems, 2019 - Elsevier
In big data sources, real-world entities are typically represented with a variety of schemata
and formats (e.g., relational records, JSON objects, etc.). Different profiles (i.e. …

A survey of blocking and filtering techniques for entity resolution

G Papadakis, D Skoutas, E Thanos… - arXiv preprint arXiv …, 2019 - arxiv.org
Efficiency techniques are an integral part of Entity Resolution, since its infancy. In this
survey, we organized the bulk of works in the field into Blocking, Filtering and hybrid …