- Academic Search

Blocking and filtering techniques for entity resolution: A survey

G Papadakis, D Skoutas, E Thanos… - ACM Computing Surveys …, 2020 - dl.acm.org

Entity Resolution (ER), a core task of Data Integration, detects different entity profiles that
correspond to the same real-world object. Due to its inherently quadratic complexity, a series …

Cited by 234 · All 7 versions

[PDF] rutgers.edu

String similarity search and join: a survey

M Yu, G Li, D Deng, J Feng - Frontiers of Computer Science, 2016 - Springer

String similarity search and join are two important operations in data cleaning and
integration, which extend traditional exact search and exact join operations in databases by …

Cited by 179 · All 17 versions

[PDF] sjtu.edu.cn

[PDF] An important aspect of big data: Data usability

J Li, X Liu - Journal of Computer Research and Development, 2013 - cs.sjtu.edu.cn

Cited by 168 · All 4 versions

[PDF] psu.edu

Fuzzy keyword search over encrypted data in cloud computing

J Li, Q Wang, C Wang, N Cao, K Ren… - 2010 Proceedings IEEE …, 2010 - ieeexplore.ieee.org

As Cloud Computing becomes prevalent, more and more sensitive information are being
centralized into the cloud. For the protection of data privacy, sensitive data usually have to …

Cited by 1407 · All 15 versions

[PDF] acm.org

Josie: Overlap set similarity search for finding joinable tables in data lakes

E Zhu, D Deng, F Nargesian, RJ Miller - Proceedings of the 2019 …, 2019 - dl.acm.org

We present a new solution for finding joinable tables in massive data lakes: given a table
and one join column, find tables that can be joined with the given table on the largest …

Cited by 224 · All 7 versions

[PDF] inria.fr

Efficient similarity joins for near-duplicate detection

C Xiao, W Wang, X Lin, JX Yu, G Wang - ACM Transactions on Database …, 2011 - dl.acm.org

With the increasing amount of data and the need to integrate data from multiple data
sources, one of the challenging issues is to identify near-duplicate records efficiently. In this …

Cited by 941 · All 17 versions

[PDF] arxiv.org

Semantics-aware dataset discovery from data lakes with contextualized column-based representation learning

G Fan, J Wang, Y Li, D Zhang, R Miller - arXiv preprint arXiv:2210.01922, 2022 - arxiv.org

Dataset discovery from data lakes is essential in many real application scenarios. In this
paper, we propose Starmie, an end-to-end framework for dataset discovery from data lakes …

Cited by 69 · All 5 versions

[PDF] tsinghua.edu.cn

Can we beat the prefix filtering? An adaptive framework for similarity join and search

J Wang, G Li, J Feng - Proceedings of the 2012 ACM SIGMOD …, 2012 - dl.acm.org

As two important operations in data cleaning, similarity join and similarity search have
attracted much attention recently. Existing methods to support similarity join usually adopt a …

Cited by 298 · All 10 versions

[PDF] arxiv.org

V-smart-join: A scalable mapreduce framework for all-pair similarity joins of multisets and vectors

A Metwally, C Faloutsos - arXiv preprint arXiv:1204.6077, 2012 - arxiv.org

This work proposes V-SMART-Join, a scalable MapReduce-based framework for
discovering all pairs of similar entities. The V-SMART-Join framework is applicable to sets …

Cited by 260 · All 12 versions

Deep entity matching: Challenges and opportunities

Y Li, J Li, Y Suhara, J Wang, W Hirota… - Journal of Data and …, 2021 - dl.acm.org

Entity matching refers to the task of determining whether two different representations refer
to the same real-world entity. It continues to be a prevalent problem for many organizations …

Cited by 80

Saved to "My Library"

Efficient merging and filtering algorithms for approximate string searches

Blocking and filtering techniques for entity resolution: A survey
