Compressed full-text indexes

G Navarro, V Mäkinen - ACM Computing Surveys (CSUR), 2007 - dl.acm.org
Full-text indexes provide fast substring search over large text collections. A serious problem
of these indexes has traditionally been their space consumption. A recent trend is to develop …

Compressed representations of sequences and full-text indexes

P Ferragina, G Manzini, V Mäkinen… - ACM Transactions on …, 2007 - dl.acm.org
Given a sequence S = s_1 s_2 … s_n of integers smaller than r = O(polylog(n)), we show how S
can be represented using nH_0(S) + o(n) bits, so that we can know any s_q, as well as …

Succinct suffix arrays based on run-length encoding

V Mäkinen, G Navarro - … Pattern Matching: 16th Annual Symposium, CPM …, 2005 - Springer
A succinct full-text self-index is a data structure built on a text T = t_1 t_2 … t_n, which takes little
space (ideally close to that of the compressed text), permits efficient search for the …

DACs: Bringing direct access to variable-length codes

NR Brisaboa, S Ladra, G Navarro - Information Processing & Management, 2013 - Elsevier
We present a new variable-length encoding scheme for sequences of integers, Directly
Addressable Codes (DACs), which enables direct access to any element of the encoded …

Recognizing expressions from face and body gesture by temporal normalized motion and appearance features

S Chen, YL Tian, Q Liu, DN Metaxas - Image and Vision Computing, 2013 - Elsevier
Recently, recognizing affects from both face and body gestures has attracted more attention.
However, there is still a lack of efficient and effective features to describe the dynamics of face and …

Prospects and limitations of full-text index structures in genome analysis

M Vyverman, B De Baets, V Fack… - Nucleic acids …, 2012 - academic.oup.com
The combination of incessant advances in sequencing technology producing large amounts
of data and innovative bioinformatics approaches, designed to cope with this data flood, has …

String matching in hardware using the FM-index

E Fernandez, W Najjar, S Lonardi - 2011 IEEE 19th Annual …, 2011 - ieeexplore.ieee.org
String matching is a ubiquitous problem that arises in a wide range of applications in
computing, e.g., packet routing, intrusion detection, web querying, and genome analysis. Due …

Word-wise handwritten Persian and Roman script identification

K Roy, A Alaei, U Pal - 2010 12th International Conference on …, 2010 - ieeexplore.ieee.org
Most countries use bi-script documents. This is because every country uses its own
national language and English as a second/foreign language. Therefore, bi-lingual document …

Simple compression code supporting random access and fast string matching

K Fredriksson, F Nikitin - … : 6th International Workshop, WEA 2007, Rome …, 2007 - Springer
Given a sequence S of n symbols over some alphabet Σ, we develop a new compression
method that is (i) very simple to implement; (ii) provides O(1) time random access to any …

A compressed self-index using a Ziv–Lempel dictionary

LMS Russo, AL Oliveira - Information Retrieval, 2008 - Springer
A compressed full-text self-index for a text T, of size u, is a data structure used to search for
patterns P, of size m, in T, that requires reduced space, i.e., space that depends on the …