Efficient inverted indexes for approximate retrieval over learned sparse representations

S Bruch, FM Nardini, C Rulli, R Venturini - Proceedings of the 47th …, 2024 - dl.acm.org
Learned sparse representations form an attractive class of contextual embeddings for text
retrieval. That is so because they are effective models of relevance and are interpretable by …

Faster learned sparse retrieval with block-max pruning

A Mallia, T Suel, N Tonellotto - … of the 47th International ACM SIGIR …, 2024 - dl.acm.org
Learned sparse retrieval systems aim to combine the effectiveness of contextualized
language models with the scalability of conventional data structures such as inverted …

Threshold-driven Pruning with Segmented Maximum Term Weights for Approximate Cluster-based Sparse Retrieval

Y Qiao, P Carlson, S He, Y Yang… - Proceedings of the 2024 …, 2024 - aclanthology.org
This paper revisits dynamic pruning through rank score thresholding in cluster-based sparse
retrieval to skip the index partially at cluster and document levels during inference. It …

Neural Lexical Search with Learned Sparse Retrieval

A Yates, C Lassance, S MacAvaney… - Proceedings of the 2024 …, 2024 - dl.acm.org
Learned Sparse Retrieval (LSR) techniques use neural machinery to represent queries and
documents as learned bags of words. In contrast with other neural retrieval techniques, such …

Beyond Quantile Methods: Improved Top-K Threshold Estimation for Traditional and Learned Sparse Indexes

J Gou, Y Liu, M Shao, T Suel - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Top-k threshold estimation is the problem of estimating the score of the k-th highest ranking
result of a search query. A good estimate can be used to speed up many common top-k …

Pruning Optimization for Efficient Top-k Document Retrieval with Learned Sparse Representations

Y Qiao - 2024 - search.proquest.com
Efficiently searching for relevant documents on a large dataset typically employs an initial
retrieval stage to extract the most relevant candidates. This process often utilizes a sparse …