Fully functional suffix trees and optimal text searching in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
Optimal-time text indexing in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
Pan-genome storage and analysis techniques
Computational pan-genome analysis has emerged from the rapid increase of available
genome sequencing data. Starting from a microbial pan-genome, the concept has spread to …
Alphabet-independent compressed text indexing
Self-indexes are able to represent a text asymptotically within the information-theoretic lower
bound under the k-th order entropy model and offer access to any text substring and indexed …
Computing MEMs and Relatives on Repetitive Text Collections
G Navarro - arXiv preprint arXiv:2210.09914, 2022 - arxiv.org
We consider the problem of computing the Maximal Exact Matches (MEMs) of a given
pattern $P[1..m]$ on a large repetitive text collection $T[1..n]$, which is represented as a …
Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems
I Müller, C Ratsch, F Faerber - EDBT, 2014 - openproceedings.org
Domain encoding is a common technique to compress the columns of a column store and to
accelerate many types of queries at the same time. It is based on the assumption that most …
Prospects and limitations of full-text index structures in genome analysis
The combination of incessant advances in sequencing technology producing large amounts
of data and innovative bioinformatics approaches, designed to cope with this data flood, has …
Versatile succinct representations of the bidirectional Burrows-Wheeler transform
We describe succinct and compact representations of the bidirectional BWT of a string s ∈ Σ*
which provide increasing navigation power and a number of space-time tradeoffs. One such …
PFP Compressed Suffix Trees
Prefix-free parsing (PFP) was introduced by Boucher et al. (2019) as a preprocessing step to
ease the computation of Burrows-Wheeler Transforms (BWTs) of genomic databases. Given …
Wheeler maps
Motivated by challenges in pangenomic read alignment, we propose a generalization of
Wheeler graphs that we call Wheeler maps. A Wheeler map stores a text T[1..n] and an …