At the roots of dictionary compression: string attractors

D Kempa, N Prezza - Proceedings of the 50th Annual ACM SIGACT …, 2018 - dl.acm.org
A well-known fact in the field of lossless text compression is that high-order entropy is a
weak model when the input contains long repetitions. Motivated by this fact, decades of …

The “runs” theorem

H Bannai, TI, S Inenaga, Y Nakashima, M Takeda… - SIAM Journal on …, 2017 - SIAM
We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon
words. The characterization leads to a proof of what was known as the “runs” conjecture RM …

Internal pattern matching queries in a text and applications

T Kociumaka, J Radoszewski, W Rytter, T Waleń - SIAM Journal on …, 2024 - SIAM
We consider several types of internal queries, that is, questions about fragments of a given
text T specified in constant space by their locations in T. Our main result is an optimal data …

A comparison of index-based Lempel-Ziv LZ77 factorization algorithms

A Al-Hafeedh, M Crochemore, L Ilie… - ACM Computing …, 2012 - dl.acm.org
Since 1977, when Lempel and Ziv described a kind of string factorization useful for text
compression, there has been a succession of algorithms proposed for computing “LZ …

Efficient data structures for internal queries in texts

T Kociumaka - PhD thesis, University of Warsaw, 2018 - repozytorium.uw.edu.pl
This thesis is devoted to internal queries in texts, which ask to solve classic text-processing
problems for substrings of a given text. More precisely, the task is to preprocess a static …

LZ77 computation based on the run-length encoded BWT

A Policriti, N Prezza - Algorithmica, 2018 - Springer
Computing the LZ77 factorization is a fundamental task in text compression and indexing,
as the size z of this compressed representation is closely related to the self-repetitiveness …

Linear time Lempel-Ziv factorization: Simple, fast, small

J Kärkkäinen, D Kempa, SJ Puglisi - … Bad Herrenalb, Germany, June 17-19 …, 2013 - Springer
Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in
many diverse applications, including data compression, text indexing, and pattern discovery …

A simple algorithm for computing the Lempel Ziv factorization

M Crochemore, L Ilie, WF Smyth - … Conference (DCC 2008), 2008 - ieeexplore.ieee.org

Sublinear time Lempel-Ziv (LZ77) factorization

J Ellert - International Symposium on String Processing and …, 2023 - Springer
The Lempel-Ziv (LZ77) factorization of a string is a widely-used algorithmic tool that
plays a central role in data compression and indexing. For a length-n string over integer …

Repetitions in strings: Algorithms and combinatorics

M Crochemore, L Ilie, W Rytter - Theoretical Computer Science, 2009 - Elsevier
The article is an overview of basic issues related to repetitions in strings, concentrating on
algorithmic and combinatorial aspects. This area is important both from theoretical and …