String similarity search and join: a survey
String similarity search and join are two important operations in data cleaning and
integration, which extend traditional exact search and exact join operations in databases by …
Hierarchical classification of protein folds using a novel ensemble classifier
The analysis of biological information from protein sequences is important for the study of
cellular functions and interactions, and protein fold recognition plays a key role in the …
Massjoin: A mapreduce-based method for scalable string similarity joins
String similarity join is an essential operation in data integration. The era of big data calls for
scalable algorithms to support large-scale string similarity joins. In this paper, we study …
Human-in-the-loop data integration
G Li - Proceedings of the VLDB Endowment, 2017 - dl.acm.org
Data integration aims to integrate data in different sources and provide users with a unified
view. However, data integration cannot be completely addressed by purely automated …
Efficient approximate entity matching using jaro-winkler distance
Jaro-Winkler distance is a measure of the similarity between two strings.
Since Jaro-Winkler distance performs well in matching personal and entity names, it is …
Discovering similarity inclusion dependencies
Inclusion dependencies (INDs) are a well-known type of data dependency, specifying that
the values of one column are contained in those of another column. INDs can be used for …
Efficient graph similarity search over large graph databases
Since many graph data are often noisy and incomplete in real applications, it has become
increasingly important to retrieve graphs in the graph database that approximately match the …
Top-k similarity join in heterogeneous information networks
As a newly emerging network model, heterogeneous information networks (HINs) have
received growing attention. Many data mining tasks have been explored in HINs, including …
Fast subtrajectory similarity search in road networks under weighted edit distance constraints
In this paper, we address a similarity search problem for spatial trajectories in road networks.
In particular, we focus on the subtrajectory similarity search problem, which involves finding …
A pivotal prefix based filtering algorithm for string similarity search
We study the string similarity search problem with edit-distance constraints, which, given a
set of data strings and a query string, finds the similar strings to the query. Existing …