Lightweight natural language text compression

NR Brisaboa, A Farina, G Navarro, JR Paramá - Information retrieval, 2007 - Springer
Variants of Huffman codes where words are taken as the source symbols are currently the
most attractive choices to compress natural language text databases. In particular, Tagged …

On the usefulness of Fibonacci compression codes

ST Klein, MK Ben-Nissan - The Computer Journal, 2010 - academic.oup.com
Recent publications advocate the use of various variable length codes for which each
codeword consists of an integral number of bytes in compression applications using large …

(S, C)-dense coding: An optimized compression code for natural language text databases

NR Brisaboa, A Farina, G Navarro… - … Symposium on String …, 2003 - Springer
This work presents (s, c)-Dense Code, a new method for compressing natural language
texts. This technique is a generalization of a previous compression technique called End …

A scalable approach for index compression using wavelet tree and LZW

S Gupta, AK Yadav, D Yadav, B Shukla - International Journal of …, 2022 - Springer
Nowadays, people are using multimedia tools that store multiple types of data in a large
amount. The information retrieval from such a vast amount is not an easy task. So, document …

Enhanced byte codes with restricted prefix properties

JS Culpepper, A Moffat - … Symposium on String Processing and Information …, 2005 - Springer
Byte codes have a number of properties that make them attractive for practical compression
systems: they are relatively easy to construct; they decode quickly; and they can be …

Compressed representation of dynamic binary relations with applications

NR Brisaboa, A Cerdeira-Pena, G de Bernardo… - Information Systems, 2017 - Elsevier
We introduce a dynamic data structure for the compact representation of binary relations R ⊆
A × B. The data structure is a dynamic variant of the k²-tree, a static compact representation …

Variable-length prefix codes with multiple delimiters

AV Anisimov, IO Zavadskyi - IEEE Transactions on Information …, 2017 - ieeexplore.ieee.org
Let m_1, m_2, ..., m_t be a fixed set of natural integers given in ascending order. A
multi-delimiter code D_{m_1, ..., m_t} consists of t words of the form 1^{m_i}0 and all other binary words …

Reverse multi-delimiter compression codes

I Zavadskyi, AV Anisimov - 2020 Data Compression Conference …, 2020 - ieeexplore.ieee.org
An enhanced version of a recently introduced family of variable length binary codes with
multiple pattern delimiters is presented and discussed. These codes are complete …

Word-wise handwritten Persian and Roman script identification

K Roy, A Alaei, U Pal - 2010 12th International Conference on …, 2010 - ieeexplore.ieee.org
Most of the countries use bi-script documents. This is because every country uses its own
national language and English as second/foreign language. Therefore, bi-lingual document …

Simple compression code supporting random access and fast string matching

K Fredriksson, F Nikitin - … : 6th International Workshop, WEA 2007, Rome …, 2007 - Springer
Given a sequence S of n symbols over some alphabet Σ, we develop a new compression
method that is (i) very simple to implement; (ii) provides O(1) time random access to any …