Techniques for inverted index compression
The data structure at the core of large-scale search engines is the inverted index, which is
essentially a collection of sorted integer sequences called inverted lists. Because of the …
PISA: Performant indexes and search for academia
Performant Indexes and Search for Academia (PISA) is an experimental search engine that
focuses on efficient implementations of state-of-the-art representations and algorithms for …
Efficient query processing for scalable web search
Search engines are exceptionally important tools for accessing information in today's world.
In satisfying the information needs of millions of users, the effectiveness (the quality of the …
Scalability challenges in web search engines
BB Cambazoglu, R Baeza-Yates - Advanced topics in information retrieval, 2011 - Springer
Continuous growth of the Web and user bases forces web search engine companies to
make costly investments on very large compute infrastructures. The scalability of these …
Faster BlockMax WAND with variable-sized blocks
Query processing is one of the main bottlenecks in large-scale search engines. Retrieving
the top k most relevant documents for a given query can be extremely expensive, as it …
CoCo-trie: Data-aware compression and indexing of strings
We address the problem of compressing and indexing a sorted dictionary of strings to
support efficient lookups and more sophisticated operations, such as prefix, predecessor …
Fast dictionary-based compression for inverted indexes
Dictionary-based compression schemes provide fast decoding operation, typically at the
expense of reduced compression effectiveness compared to statistical or probability-based …
Compressing and querying integer dictionaries under linearities and repetitions
We revisit the fundamental problem of compressing an integer dictionary that supports
efficient rank and select operations by exploiting simultaneously two kinds of regularities arising in real …
Index compression using byte-aligned ANS coding and two-dimensional contexts
We examine approaches used for block-based inverted index compression, such as the
OptPFOR mechanism, in which fixed-length blocks of postings data are compressed …
Clustered Elias-Fano indexes
State-of-the-art encoders for inverted indexes compress each posting list individually.
Encoding clusters of posting lists offers the possibility of reducing the redundancy of the lists …