Blocking and filtering techniques for entity resolution: A survey

G Papadakis, D Skoutas, E Thanos… - ACM Computing Surveys …, 2020 - dl.acm.org
Entity Resolution (ER), a core task of Data Integration, detects different entity profiles that
correspond to the same real-world object. Due to its inherently quadratic complexity, a series …

Security and privacy aspects in MapReduce on clouds: A survey

P Derbeko, S Dolev, E Gudes, S Sharma - Computer science review, 2016 - Elsevier
MapReduce is a programming system for distributed processing of large-scale data in an
efficient and fault tolerant manner on a private, public, or hybrid cloud. MapReduce is …

[BOOK][B] Entity resolution in the web of data

In recent years, several knowledge bases have been built to enable large-scale knowledge
sharing, but also an entity-centric Web search, mixing both structured data and text …

A survey on geographically distributed big-data processing using MapReduce

S Dolev, P Florissi, E Gudes… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Hadoop and Spark are widely used distributed processing frameworks for large-scale data
processing in an efficient and fault-tolerant manner on private or public clouds. These big …

Keys for graphs

W Fan, Z Fan, C Tian, XL Dong - Proceedings of the VLDB …, 2015 - research.ed.ac.uk
Keys for graphs aim to uniquely identify entities represented by vertices in a graph. We
propose a class of keys that are recursively defined in terms of graph patterns, and are …

Semantic-aware blocking for entity resolution

Q Wang, M Cui, H Liang - IEEE Transactions on Knowledge …, 2015 - ieeexplore.ieee.org
In this paper, we propose a semantic-aware blocking framework for entity resolution (ER).
The proposed framework is built using locality-sensitive hashing (LSH) techniques, which …

A survey of blocking and filtering techniques for entity resolution

G Papadakis, D Skoutas, E Thanos… - arXiv preprint arXiv …, 2019 - arxiv.org
Efficiency techniques are an integral part of Entity Resolution, since its infancy. In this
survey, we organized the bulk of works in the field into Blocking, Filtering and hybrid …

MinoanER: Schema-agnostic, non-iterative, massively parallel resolution of web entities

V Efthymiou, G Papadakis, K Stefanidis… - arXiv preprint arXiv …, 2019 - arxiv.org
Entity Resolution (ER) aims to identify different descriptions in various Knowledge Bases
(KBs) that refer to the same entity. ER is challenged by the Variety, Volume and Veracity of …

Benchmarking blocking algorithms for web entities

V Efthymiou, K Stefanidis… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
An increasing number of entities are described by interlinked data rather than documents on
the Web. Entity Resolution (ER) aims to identify descriptions of the same real-world entity …

A Large-scale Offer Alignment Model for Partitioning Filtering and Matching Product Offers

W Huang, A Melo, JZ Pan - Proceedings of the 47th International ACM …, 2024 - dl.acm.org
Offer alignment is a key step in a product knowledge graph construction pipeline. It aims to
align retailer offers of the same product for better coverage of product details. With the rapid …