An overview of end-to-end entity resolution for big data
One of the most critical tasks for improving data quality and increasing the reliability of data
analytics is Entity Resolution (ER), which aims to identify different descriptions that refer to …
Blocking and filtering techniques for entity resolution: A survey
Entity Resolution (ER), a core task of Data Integration, detects different entity profiles that
correspond to the same real-world object. Due to its inherently quadratic complexity, a series …
SourcererCC: Scaling code clone detection to big-code
Despite a decade of active research, there has been a marked lack in clone detection
techniques that scale to large repositories for detecting near-miss clones. In this paper, we …
Constructing an interactive natural language interface for relational databases
Natural language has been the holy grail of query interface designers, but has generally
been considered too hard to work with, except in limited specific circumstances. In this …
Efficient k-nearest neighbor graph construction for generic similarity measures
K-Nearest Neighbor Graph (K-NNG) construction is an important operation with many web
related applications, including collaborative filtering, similarity search, and many others in …
CrowdER: Crowdsourcing entity resolution
Entity resolution is central to data integration and data cleaning. Algorithmic approaches
have been improving in quality, but remain far from perfect. Crowdsourcing platforms offer a …
DataTone: Managing ambiguity in natural language interfaces for data visualization
Answering questions with data is a difficult and time-consuming process. Visual dashboards
and templates make it easy to get started, but asking more sophisticated questions often …
JOSIE: Overlap set similarity search for finding joinable tables in data lakes
We present a new solution for finding joinable tables in massive data lakes: given a table
and one join column, find tables that can be joined with the given table on the largest …
Practical non-interactive searchable encryption with forward and backward privacy
In Dynamic Symmetric Searchable Encryption (DSSE), forward privacy ensures that
previous search queries cannot be associated with future updates, while backward privacy …
Evaluation of entity resolution approaches on real-world match problems
Despite the huge amount of recent research efforts on entity resolution (matching) there has
not yet been a comparative evaluation on the relative effectiveness and efficiency of …