Fully functional suffix trees and optimal text searching in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Journal of the ACM (JACM), 2020 - dl.acm.org
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

Optimal-time text indexing in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Proceedings of the Twenty-Ninth Annual ACM …, 2018 - SIAM
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

Pan-genome storage and analysis techniques

T Zekic, G Holley, J Stoye - Comparative Genomics: Methods and …, 2017 - Springer
Computational pan-genome analysis has emerged from the rapid increase of available
genome sequencing data. Starting from a microbial pan-genome, the concept has spread to …

Alphabet-independent compressed text indexing

D Belazzougui, G Navarro - ACM Transactions on Algorithms (TALG), 2014 - dl.acm.org
Self-indexes are able to represent a text asymptotically within the information-theoretic lower
bound under the k-th order entropy model and offer access to any text substring and indexed …

Computing MEMs and Relatives on Repetitive Text Collections

G Navarro - arXiv preprint arXiv:2210.09914, 2022 - arxiv.org
We consider the problem of computing the Maximal Exact Matches (MEMs) of a given
pattern $P[1..m]$ on a large repetitive text collection $T[1..n]$, which is represented as a …

Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems

I Müller, C Ratsch, F Faerber - EDBT, 2014 - openproceedings.org
Domain encoding is a common technique to compress the columns of a column store and to
accelerate many types of queries at the same time. It is based on the assumption that most …

Prospects and limitations of full-text index structures in genome analysis

M Vyverman, B De Baets, V Fack… - Nucleic acids …, 2012 - academic.oup.com
The combination of incessant advances in sequencing technology producing large amounts
of data and innovative bioinformatics approaches, designed to cope with this data flood, has …

Versatile succinct representations of the bidirectional Burrows-Wheeler transform

D Belazzougui, F Cunial, J Kärkkäinen… - Algorithms–ESA 2013 …, 2013 - Springer
We describe succinct and compact representations of the bidirectional BWT of a string s ∈ Σ*
which provide increasing navigation power and a number of space-time tradeoffs. One such …

PFP Compressed Suffix Trees

C Boucher, O Cvacho, T Gagie, J Holub… - 2021 Proceedings of the …, 2021 - SIAM
Prefix-free parsing (PFP) was introduced by Boucher et al. (2019) as a preprocessing step to
ease the computation of Burrows-Wheeler Transforms (BWTs) of genomic databases. Given …

Wheeler maps

A Baláž, T Gagie, A Goga, S Heumos… - Latin American …, 2024 - Springer
Motivated by challenges in pangenomic read alignment, we propose a generalization of
Wheeler graphs that we call Wheeler maps. A Wheeler map stores a text T[1..n] and an …