Indexing highly repetitive string collections, part II: compressed indexes

G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …

Fully functional suffix trees and optimal text searching in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Journal of the ACM (JACM), 2020 - dl.acm.org
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

Optimal-time text indexing in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Proceedings of the Twenty-Ninth Annual ACM …, 2018 - SIAM
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

Text indexing for long patterns: Anchors are all you need

L Ayad, G Loukides, S Pissis - Proceedings of the VLDB Endowment …, 2023 - kclpure.kcl.ac.uk
In many real-world database systems, a large fraction of the data is represented by strings:
sequences of letters over some alphabet. This is because strings can easily encode data …

Wheeler maps

A Baláž, T Gagie, A Goga, S Heumos… - Latin American …, 2024 - Springer
Motivated by challenges in pangenomic read alignment, we propose a generalization of
Wheeler graphs that we call Wheeler maps. A Wheeler map stores a text T[1..n] and an …

Gapped indexing for consecutive occurrences

P Bille, IL Gørtz, MR Pedersen, TA Steiner - Algorithmica, 2023 - Springer
The classic string indexing problem is to preprocess a string S into a compact data structure
that supports efficient pattern matching queries. Typical queries include existential queries …

Indexing weighted sequences: Neat and efficient

C Barton, T Kociumaka, C Liu, SP Pissis… - Information and …, 2020 - Elsevier
A weighted sequence is a sequence of probability mass functions over a finite alphabet. A
weighted index is a data structure constructed for a weighted sequence and a threshold 1/z …

Breaking the 𝒪(n)-Barrier in the Construction of Compressed Suffix Arrays and Suffix Trees

D Kempa, T Kociumaka - Proceedings of the 2023 Annual ACM-SIAM …, 2023 - SIAM
The suffix array, describing the lexicographical order of suffixes of a given text, and the suffix
tree, a path-compressed trie of all suffixes, are the two most fundamental data structures for …

Indexing highly repetitive string collections

G Navarro - arXiv preprint arXiv:2004.02781, 2020 - arxiv.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …

Gapped string indexing in subquadratic space and sublinear query time

P Bille, IL Gørtz, M Lewenstein, SP Pissis… - arXiv preprint arXiv …, 2022 - arxiv.org
In Gapped String Indexing, the goal is to compactly represent a string $S$ of length $n$
such that for any query consisting of two strings $P_1$ and $P_2$, called patterns, and an …