Efficient inverted indexes for approximate retrieval over learned sparse representations
Learned sparse representations form an attractive class of contextual embeddings for text
retrieval. That is so because they are effective models of relevance and are interpretable by …
Pairing Clustered Inverted Indexes with κ-NN Graphs for Fast Approximate Retrieval over Learned Sparse Representations
Learned sparse representations form an effective and interpretable class of embeddings for
text retrieval. While exact top-k retrieval over such embeddings faces efficiency challenges …
SPLADE-v3: New baselines for SPLADE
A companion to the release of the latest version of the SPLADE library. We describe
changes to the training structure and present our latest series of models--SPLADE-v3. We …
A Noise-Oriented and Redundancy-Aware Instance Selection Framework
Fine-tuning transformer-based deep-learning models are currently at the forefront of natural
language processing (NLP) and information retrieval (IR) tasks. However, fine-tuning these …
Enhancing Lexicon-Based Text Embeddings with Large Language Models
Recent large language models (LLMs) have demonstrated exceptional performance on
general-purpose text embedding tasks. While dense embeddings have dominated related …
Rank-Biased Quality Measurement for Sets and Rankings
Experiments often result in the need to compare an observation against a reference, where
observation and reference are selections made from some specified domain. The goal is to …
Investigating the Scalability of Approximate Sparse Retrieval Algorithms to Massive Datasets
Learned sparse text embeddings have gained popularity due to their effectiveness in top-k
retrieval and inherent interpretability. Their distributional idiosyncrasies, however, have long …