Efficient query evaluation on probabilistic databases
N Dalvi, D Suciu - The VLDB Journal, 2007 - Springer
We describe a framework for supporting arbitrarily complex SQL queries with “uncertain”
predicates. The query semantics is based on a probabilistic model and the results are …
Efficient similarity joins for near-duplicate detection
With the increasing amount of data and the need to integrate data from multiple data
sources, one of the challenging issues is to identify near-duplicate records efficiently. In this …
Efficient similarity search and classification via rank aggregation
We propose a novel approach to performing efficient similarity search and classification in
high dimensional data. In this framework, the database elements are vectors in a Euclidean …
Trio: A system for integrated management of data, accuracy, and lineage
J Widom - 2004 - ilpubs.stanford.edu
Trio is a new database system that manages not only data, but also the accuracy and
lineage of the data. Approximate (uncertain, probabilistic, incomplete, fuzzy, and imprecise!) …
ULDBs: Databases with uncertainty and lineage
This paper introduces ULDBs, an extension of relational databases with simple yet
expressive constructs for representing and manipulating both lineage and …
Web-scale data integration: You can only afford to pay as you go
J Madhavan, SR Jeffery, S Cohen, X Dong… - Proceedings of …, 2007 - datascienceassn.org
The World Wide Web is witnessing an increase in the amount of structured
content–vast heterogeneous collections of structured data are on the rise due to the Deep …
Working models for uncertain data
This paper explores an inherent tension in modeling and querying uncertain data: simple,
intuitive representations of uncertain data capture many application requirements, but these …
Top-k set similarity joins
Similarity join is a useful primitive operation underlying many applications, such as near
duplicate Web page detection, data integration, and pattern recognition. Traditional similarity …
Top-k query evaluation with probabilistic guarantees
Top-k queries based on ranking elements of multidimensional datasets are a fundamental
building block for many kinds of information discovery. The best known general-purpose …
Klee: A framework for distributed top-k query algorithms
This paper addresses the efficient processing of top-k queries in wide-area distributed data
repositories where the index lists for the attribute values (or text terms) of a query are …