Indexing highly repetitive string collections, part II: Compressed indexes
G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
Fully functional suffix trees and optimal text searching in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
At the roots of dictionary compression: string attractors
A well-known fact in the field of lossless text compression is that high-order entropy is a
weak model when the input contains long repetitions. Motivated by this fact, decades of …
Collapsing the hierarchy of compressed data structures: Suffix arrays in optimal compressed space
The last two decades have witnessed a dramatic increase in the amount of highly repetitive
datasets consisting of sequential data (strings, texts). Processing these massive amounts of …
Optimal-time text indexing in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
String synchronizing sets: sublinear-time BWT construction and optimal LCE data structure
Burrows–Wheeler transform (BWT) is an invertible text transformation that, given a text T of
length n, permutes its symbols according to the lexicographic order of suffixes of T. BWT is …
Near-optimal quantum algorithms for bounded edit distance and Lempel-Ziv factorization
Measuring sequence similarity and compressing texts are among the most fundamental
tasks in string algorithms. In this work, we develop near-optimal quantum algorithms for the …
External memory BWT and LCP computation for sequence collections with applications
Background: Sequencing technologies produce larger and larger collections of
biosequences that have to be stored in compressed indices supporting fast search …
Text indexing for long patterns: Anchors are all you need
In many real-world database systems, a large fraction of the data is represented by strings:
sequences of letters over some alphabet. This is because strings can easily encode data …
On the complexity of BWT-runs minimization via alphabet reordering
The Burrows-Wheeler Transform (BWT) has been an essential tool in text compression and
indexing. First introduced in 1994, it went on to provide the backbone for the first encoding of …