- Academic Search

From frequency to meaning: Vector space models of semantics

PD Turney, P Pantel - Journal of artificial intelligence research, 2010 - jair.org

Computers understand very little of the meaning of human language. This profoundly limits
our ability to give instructions to computers, the ability of computers to explain their actions to …

Cited by 4042

An overview of end-to-end entity resolution for big data

V Christophides, V Efthymiou, T Palpanas… - ACM Computing …, 2020 - dl.acm.org

One of the most critical tasks for improving data quality and increasing the reliability of data
analytics is Entity Resolution (ER), which aims to identify different descriptions that refer to …

Cited by 229

Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets

CCM Yeh, Y Zhu, L Ulanova, N Begum… - 2016 IEEE 16th …, 2016 - ieeexplore.ieee.org

The all-pairs-similarity-search (or similarity join) problem has been extensively studied for
text and a handful of other datatypes. However, surprisingly little progress has been made …

Cited by 879

[BOOK] The data matching process

P Christen - 2012 - Springer

This chapter provides an overview of the data matching process, and describes the five
major steps involved in this process: data pre-processing (cleaning and standardisation) …

Cited by 1665

[BOOK] Data cleaning

IF Ilyas, X Chu - 2019 - books.google.com

This is an overview of the end-to-end data cleaning process. Data quality is one of the most
important problems in data management, since dirty data often leads to inaccurate data …

Cited by 359

Efficient k-nearest neighbor graph construction for generic similarity measures

W Dong, C Moses, K Li - … of the 20th international conference on World …, 2011 - dl.acm.org

K-Nearest Neighbor Graph (K-NNG) construction is an important operation with many web
related applications, including collaborative filtering, similarity search, and many others in …

Cited by 851

Asymmetric LSH (ALSH) for sublinear time maximum inner product search (MIPS)

A Shrivastava, P Li - Advances in neural information …, 2014 - proceedings.neurips.cc

We present the first provably sublinear time hashing algorithm for approximate
Maximum Inner Product Search (MIPS). Searching with (un-normalized) inner product as …

Cited by 597

CrowdER: Crowdsourcing entity resolution

J Wang, T Kraska, MJ Franklin, J Feng - arXiv preprint arXiv:1208.1927, 2012 - arxiv.org

Entity resolution is central to data integration and data cleaning. Algorithmic approaches
have been improving in quality, but remain far from perfect. Crowdsourcing platforms offer a …

Cited by 758

Efficient similarity joins for near-duplicate detection

C Xiao, W Wang, X Lin, JX Yu, G Wang - ACM Transactions on Database …, 2011 - dl.acm.org

With the increasing amount of data and the need to integrate data from multiple data
sources, one of the challenging issues is to identify near-duplicate records efficiently. In this …

Cited by 943

Blocking and filtering techniques for entity resolution: A survey

G Papadakis, D Skoutas, E Thanos… - ACM Computing Surveys …, 2020 - dl.acm.org

Entity Resolution (ER), a core task of Data Integration, detects different entity profiles that
correspond to the same real-world object. Due to its inherently quadratic complexity, a series …

Cited by 234
