At the roots of dictionary compression: string attractors
A well-known fact in the field of lossless text compression is that high-order entropy is a
weak model when the input contains long repetitions. Motivated by this fact, decades of …
The “runs” theorem
We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon
words. The characterization leads to a proof of what was known as the “runs” conjecture RM …
Internal pattern matching queries in a text and applications
We consider several types of internal queries, that is, questions about fragments of a given
text specified in constant space by their locations in the text. Our main result is an optimal data …
A comparison of index-based Lempel-Ziv LZ77 factorization algorithms
Since 1977, when Lempel and Ziv described a kind of string factorization useful for text
compression, there has been a succession of algorithms proposed for computing “LZ …
Efficient data structures for internal queries in texts
T Kociumaka - PhD thesis, University of Warsaw, 2018 - repozytorium.uw.edu.pl
This thesis is devoted to internal queries in texts, which ask to solve classic text-processing
problems for substrings of a given text. More precisely, the task is to preprocess a static …
LZ77 computation based on the run-length encoded BWT
Computing the LZ77 factorization is a fundamental task in text compression and indexing,
being the size z of this compressed representation closely related to the self-repetitiveness …
Linear time Lempel-Ziv factorization: Simple, fast, small
Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in
many diverse applications, including data compression, text indexing, and pattern discovery …
A simple algorithm for computing the Lempel Ziv factorization
Maxime Crochemore, Lucian Ilie, WF …
Sublinear time Lempel-Ziv (LZ77) factorization
J Ellert - International Symposium on String Processing and …, 2023 - Springer
The Lempel-Ziv (LZ77) factorization of a string is a widely-used algorithmic tool that
plays a central role in data compression and indexing. For a length-n string over integer …
Repetitions in strings: Algorithms and combinatorics
The article is an overview of basic issues related to repetitions in strings, concentrating on
algorithmic and combinatorial aspects. This area is important both from theoretical and …