Efficient inverted indexes for approximate retrieval over learned sparse representations

S Bruch, FM Nardini, C Rulli, R Venturini - Proceedings of the 47th …, 2024 - dl.acm.org
Learned sparse representations form an attractive class of contextual embeddings for text
retrieval. That is so because they are effective models of relevance and are interpretable by …

Pairing Clustered Inverted Indexes with κ-NN Graphs for Fast Approximate Retrieval over Learned Sparse Representations

S Bruch, FM Nardini, C Rulli, R Venturini - Proceedings of the 33rd ACM …, 2024 - dl.acm.org
Learned sparse representations form an effective and interpretable class of embeddings for
text retrieval. While exact top-k retrieval over such embeddings faces efficiency challenges …

SPLADE-v3: New baselines for SPLADE

C Lassance, H Déjean, T Formal… - arXiv preprint arXiv …, 2024 - arxiv.org
A companion to the release of the latest version of the SPLADE library. We describe
changes to the training structure and present our latest series of models--SPLADE-v3. We …

A Noise-Oriented and Redundancy-Aware Instance Selection Framework

W Cunha, A Moreo, A Esuli, F Sebastiani… - ACM Transactions on …, 2025 - dl.acm.org
Fine-tuning transformer-based deep-learning models is currently at the forefront of natural
language processing (NLP) and information retrieval (IR) tasks. However, fine-tuning these …

Enhancing Lexicon-Based Text Embeddings with Large Language Models

Y Lei, T Shen, Y Cao, A Yates - arXiv preprint arXiv:2501.09749, 2025 - arxiv.org
Recent large language models (LLMs) have demonstrated exceptional performance on
general-purpose text embedding tasks. While dense embeddings have dominated related …

Rank-Biased Quality Measurement for Sets and Rankings

A Moffat, J Mackenzie, A Mallia, M Petri - Proceedings of the 2024 …, 2024 - dl.acm.org
Experiments often result in the need to compare an observation against a reference, where
observation and reference are selections made from some specified domain. The goal is to …

Investigating the Scalability of Approximate Sparse Retrieval Algorithms to Massive Datasets

S Bruch, FM Nardini, C Rulli, R Venturini… - arXiv preprint arXiv …, 2025 - arxiv.org
Learned sparse text embeddings have gained popularity due to their effectiveness in top-k
retrieval and inherent interpretability. Their distributional idiosyncrasies, however, have long …