A survey of BWT variants for string collections
D Cenzato, Z Lipták - Bioinformatics, 2024 - academic.oup.com
Motivation: In recent years, the focus of bioinformatics research has moved from individual
sequences to collections of sequences. Given the fundamental role of the Burrows-Wheeler …
Toward a definitive compressibility measure for repetitive sequences
While the kth order empirical entropy is an accepted measure of the compressibility of
individual sequences on classical text collections, it is useful only for small values of k and …
A survey of BWT variants for string collections
D Cenzato, Z Lipták - arXiv preprint arXiv:2202.13235, 2022 - arxiv.org
In recent years, the focus of bioinformatics research has moved from individual sequences to
collections of sequences. Given the fundamental role of the Burrows-Wheeler Transform …
Bit catastrophes for the Burrows-Wheeler transform
A bit catastrophe, loosely defined, is when a change in just one character of a string causes
a significant change in the size of the compressed string. We study this phenomenon for the …
On the number of equal-letter runs of the bijective Burrows-Wheeler transform
The Bijective Burrows-Wheeler Transform (BBWT) is a variant of the famous BWT
[Burrows and Wheeler, 1994]. The BBWT was introduced by Gil and Scott in 2012, and is …
Iterated straight-line programs
We explore an extension to straight-line programs (SLPs) that outperforms, for some text
families, the measure δ based on substring complexity, a lower bound for most measures …
On the impact of morphisms on BWT-Runs
Morphisms are widely studied combinatorial objects that can be used for generating infinite
families of words. In the context of information theory, injective morphisms are called …
Bijective BWT based compression schemes
We investigate properties of the bijective Burrows-Wheeler transform (BBWT). We show that
for any string w, a bidirectional macro scheme of size O(r_B) can be induced from the BBWT …
Computing NP-hard repetitiveness measures via MAX-SAT
Repetitiveness measures reveal profound characteristics of datasets, and give rise to
compressed data structures and algorithms working in compressed space. Alas, the …
Maintaining the Size of LZ77 on Semi-Dynamic Strings
We consider the problem of maintaining the size of the LZ77 factorization of a string S of
length at most n under the following operations: (a) appending a given letter to S and (b) …