Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
Indexing highly repetitive string collections, part II: compressed indexes
G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
represent them within their compressed space while at the same time offering indexed …
Compressed full-text indexes
Full-text indexes provide fast substring search over large text collections. A serious problem
of these indexes has traditionally been their space consumption. A recent trend is to develop …
of these indexes has traditionally been their space consumption. A recent trend is to develop …
At the roots of dictionary compression: string attractors
A well-known fact in the field of lossless text compression is that high-order entropy is a
weak model when the input contains long repetitions. Motivated by this fact, decades of …
weak model when the input contains long repetitions. Motivated by this fact, decades of …
Resolution of the burrows-wheeler transform conjecture
D Kempa, T Kociumaka - Communications of the ACM, 2022 - dl.acm.org
Abstract The Burrows-Wheeler Transform (BWT) is an invertible text transformation that
permutes symbols of a text according to the lexicographical order of its suffixes. BWT is the …
permutes symbols of a text according to the lexicographical order of its suffixes. BWT is the …
On compressing and indexing repetitive sequences
S Kreft, G Navarro - Theoretical Computer Science, 2013 - Elsevier
We introduce LZ-End, a new member of the Lempel–Ziv family of text compressors, which
achieves compression ratios close to those of LZ77 but is much faster at extracting arbitrary …
achieves compression ratios close to those of LZ77 but is much faster at extracting arbitrary …
Alphabet-independent compressed text indexing
D Belazzougui, G Navarro - ACM Transactions on Algorithms (TALG), 2014 - dl.acm.org
Self-indexes are able to represent a text asymptotically within the information-theoretic lower
bound under the k th order entropy model and offer access to any text substring and indexed …
bound under the k th order entropy model and offer access to any text substring and indexed …
Toward a definitive compressibility measure for repetitive sequences
T Kociumaka, G Navarro… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
While the th order empirical entropy is an accepted measure of the compressibility of
individual sequences on classical text collections, it is useful only for small values of and …
individual sequences on classical text collections, it is useful only for small values of and …
[HTML][HTML] Universal compressed text indexing
The rise of repetitive datasets has lately generated a lot of interest in compressed self-
indexes based on dictionary compression, a rich and heterogeneous family of techniques …
indexes based on dictionary compression, a rich and heterogeneous family of techniques …
Optimal lower and upper bounds for representing sequences
D Belazzougui, G Navarro - ACM Transactions on Algorithms (TALG), 2015 - dl.acm.org
Sequence representations supporting the queries access, select, and rank are at the core of
many data structures. There is a considerable gap between the various upper bounds and …
many data structures. There is a considerable gap between the various upper bounds and …
LZ77 computation based on the run-length encoded BWT
A Policriti, N Prezza - Algorithmica, 2018 - Springer
Computing the LZ77 factorization is a fundamental task in text compression and indexing,
being the size z of this compressed representation closely related to the self-repetitiveness …
being the size z of this compressed representation closely related to the self-repetitiveness …