Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …
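
The retrieve-then-rerank structure this snippet describes can be sketched in a few lines. The corpus, the term-overlap first-stage scorer, and the length-normalised "reranker" below are illustrative stand-ins, not any particular system from the paper.

```python
# Toy sketch of a two-stage ranking pipeline: a cheap first-stage retriever
# narrows the corpus to K candidates, and a costlier scorer reranks only them.

def first_stage_score(query: str, doc: str) -> float:
    """Cheap lexical score: number of query terms appearing in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank_score(query: str, doc: str) -> float:
    """Placeholder for an expensive model (e.g. a cross-encoder); here the
    lexical score is merely refined with a length normalisation."""
    return first_stage_score(query, doc) / (1 + len(doc.split()))

def search(query: str, corpus: list[str], k_candidates: int = 3, k_final: int = 2):
    # Stage 1: score every document cheaply, keep the top-K candidates.
    candidates = sorted(corpus, key=lambda d: first_stage_score(query, d),
                        reverse=True)[:k_candidates]
    # Stage 2: rerank only the candidate subset with the expensive scorer.
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)[:k_final]

corpus = [
    "dense retrieval with learned embeddings",
    "sparse lexical matching with inverted indexes",
    "late interaction over contextualized token embeddings",
    "a cooking recipe for tomato soup",
]
print(search("lexical matching retrieval", corpus))
```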

Learning to tokenize for generative retrieval

W Sun, L Yan, Z Chen, S Wang, H Zhu… - Advances in …, 2023 - proceedings.neurips.cc
As a new paradigm in information retrieval, generative retrieval directly generates a ranked
list of document identifiers (docids) for a given query using generative language models …
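
A minimal sketch of the generative-retrieval idea named in the snippet: documents are addressed by string identifiers, and retrieval ranks docids by how likely a generative model is to emit them for the query. The docid set and the per-token probability table below are assumptions for illustration; real systems decode docids token by token with a trained language model under constraints.

```python
import math

DOCIDS = ["sports-2021-final", "sports-2020-semi", "politics-2021-vote"]

def token_prob(query: str, token: str) -> float:
    """Hypothetical per-token probability p(token | query, prefix)."""
    return 0.9 if token in query.lower().split() else 0.1

def docid_log_prob(query: str, docid: str) -> float:
    # Score a docid as the sum of per-token log-probabilities of its segments.
    return sum(math.log(token_prob(query, tok)) for tok in docid.split("-"))

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank the finite space of valid docids by sequence log-probability.
    return sorted(DOCIDS, key=lambda d: docid_log_prob(query, d), reverse=True)[:k]

print(retrieve("sports results 2021"))
```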

Efficiently teaching an effective dense retriever with balanced topic aware sampling

S Hofstätter, SC Lin, JH Yang, J Lin… - Proceedings of the 44th …, 2021 - dl.acm.org
A vital step towards the widespread adoption of neural retrieval models is their resource
efficiency throughout the training, indexing and query workflows. The neural IR community …
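
A rough sketch of the topic-aware batch composition suggested by the title: queries are grouped into topic clusters beforehand, and each training batch is filled from only a few clusters so in-batch negatives come from related queries. The cluster assignments, batch size, and clusters-per-batch values are illustrative assumptions.

```python
import random

def topic_aware_batches(query_clusters: dict[int, list[str]],
                        batch_size: int = 4,
                        clusters_per_batch: int = 2,
                        num_batches: int = 3,
                        seed: int = 0):
    # Cluster assignments are assumed to be given (e.g. from k-means over
    # query embeddings); each batch draws queries from a few clusters only.
    rng = random.Random(seed)
    cluster_ids = list(query_clusters)
    for _ in range(num_batches):
        chosen = rng.sample(cluster_ids, clusters_per_batch)
        batch = []
        while len(batch) < batch_size:
            cid = rng.choice(chosen)
            batch.append(rng.choice(query_clusters[cid]))
        yield batch

clusters = {
    0: ["capital of france", "largest city in france", "population of paris"],
    1: ["python list sort", "sort dict by value", "reverse a string in python"],
    2: ["symptoms of flu", "flu vs cold", "how long does flu last"],
}
for batch in topic_aware_batches(clusters):
    print(batch)
```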

SPLADE: Sparse lexical and expansion model for first stage ranking

T Formal, B Piwowarski, S Clinchant - Proceedings of the 44th …, 2021 - dl.acm.org
In neural Information Retrieval, ongoing research is directed towards improving the first
retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using …
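
A small sketch of the SPLADE-style sparse representation referenced in the title: each input token yields a vector of logits over the vocabulary (from an MLM head), and the weight of vocabulary term j is w_j = sum_i log(1 + relu(logit_ij)), so queries and documents can be matched with a dot product over the vocabulary. The random logits below stand in for a real transformer's MLM outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 10

def splade_weights(mlm_logits: np.ndarray) -> np.ndarray:
    """mlm_logits: (num_tokens, vocab_size) -> (vocab_size,) term weights."""
    # ReLU + log1p keep the vector non-negative and encourage sparsity.
    return np.log1p(np.maximum(mlm_logits, 0.0)).sum(axis=0)

doc_logits = rng.normal(size=(5, vocab_size))    # 5 document tokens (stand-in)
query_logits = rng.normal(size=(3, vocab_size))  # 3 query tokens (stand-in)

doc_vec = splade_weights(doc_logits)
query_vec = splade_weights(query_logits)
print("relevance score:", float(query_vec @ doc_vec))
```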

Autoregressive search engines: Generating substrings as document identifiers

M Bevilacqua, G Ottaviano, P Lewis… - Advances in …, 2022 - proceedings.neurips.cc
Knowledge-intensive language tasks require NLP systems to both provide the
correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive …
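
The title's "substrings as document identifiers" idea can be illustrated roughly as follows: an autoregressive model generates short strings with scores, and documents are ranked by the scores of the generated strings they actually contain. Real systems constrain generation with an FM-index over the corpus; here the toy corpus, the generated n-grams, and their scores are simply assumed.

```python
corpus = {
    "doc1": "the eiffel tower is located in paris france",
    "doc2": "the colosseum is located in rome italy",
    "doc3": "paris is the capital and largest city of france",
}

# Hypothetical generator output for the query "where is the eiffel tower".
generated_ngrams = {"eiffel tower": 2.3, "located in paris": 1.7, "rome italy": 0.4}

def rank_by_ngrams(corpus: dict[str, str],
                   ngrams: dict[str, float]) -> list[tuple[str, float]]:
    scores = {}
    for docid, text in corpus.items():
        # A document's score aggregates the scores of generated n-grams it contains.
        scores[docid] = sum(s for ngram, s in ngrams.items() if ngram in text)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(rank_by_ngrams(corpus, generated_ngrams))
```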

Approximate nearest neighbor negative contrastive learning for dense text retrieval

L Xiong, C Xiong, Y Li, KF Tang, J Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Conducting text retrieval in a dense learned representation space has many intriguing
advantages over sparse retrieval. Yet the effectiveness of dense retrieval (DR) often requires …
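
A sketch of negative mining in the spirit of this paper: encode the corpus with the current model, find the documents nearest to each query, and treat the top-ranked non-relevant ones as hard negatives in a contrastive loss. The random embeddings are stand-ins for a trained encoder, and the brute-force search stands in for an approximate nearest neighbor index that would be refreshed periodically during training.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, num_docs = 8, 50
doc_emb = rng.normal(size=(num_docs, dim))
query_emb = rng.normal(size=(dim,))
positive_id = 7  # the known relevant document for this query (assumed)

# "ANN search": nearest documents to the query by inner product.
scores = doc_emb @ query_emb
ranked = np.argsort(-scores)

# Hard negatives: the highest-scoring documents that are not the positive.
hard_negative_ids = [int(d) for d in ranked if d != positive_id][:4]

# Contrastive loss: softmax cross-entropy with the positive in slot 0.
cand = np.vstack([doc_emb[positive_id], doc_emb[hard_negative_ids]])
logits = cand @ query_emb
log_probs = logits - logits.max() - np.log(np.exp(logits - logits.max()).sum())
loss = -log_probs[0]
print("hard negatives:", hard_negative_ids, "loss:", float(loss))
```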

ColBERT: Efficient and effective passage search via contextualized late interaction over BERT

O Khattab, M Zaharia - Proceedings of the 43rd International ACM SIGIR …, 2020 - dl.acm.org
Recent progress in Natural Language Understanding (NLU) is driving fast-paced advances
in Information Retrieval (IR), largely owed to fine-tuning deep language models (LMs) for …
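
The late-interaction scoring named in the title reduces to MaxSim: queries and documents are encoded into one embedding per token (documents offline), and the relevance score is the sum over query tokens of the maximum similarity to any document token. The random token embeddings below stand in for a real BERT-based encoder.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16

def normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

query_tokens = normalize(rng.normal(size=(4, dim)))   # 4 query token embeddings
doc_tokens = normalize(rng.normal(size=(30, dim)))    # 30 document token embeddings

def maxsim_score(q: np.ndarray, d: np.ndarray) -> float:
    # (num_q_tokens, num_d_tokens) cosine similarities, then max over document
    # tokens and sum over query tokens.
    sim = q @ d.T
    return float(sim.max(axis=1).sum())

print("late-interaction score:", maxsim_score(query_tokens, doc_tokens))
```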

[BOOK] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

Optimizing dense retrieval model training with hard negatives

J Zhan, J Mao, Y Liu, J Guo, M Zhang… - Proceedings of the 44th …, 2021 - dl.acm.org
Ranking has always been one of the top concerns in information retrieval research. For
decades, the lexical matching signal has dominated the ad-hoc retrieval process, but solely …
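
The hard-negative idea named in the title can be sketched as building training triples from an existing retriever's ranking: instead of pairing each query's positive with a random document, take the non-relevant documents the retriever already ranks highly, which gives the model harder examples to separate. The ranking and document identifiers below are hypothetical.

```python
import random

def build_triples(query, positive_id, ranked_doc_ids, all_doc_ids,
                  num_neg=2, seed=0):
    rng = random.Random(seed)
    non_positive = [d for d in ranked_doc_ids if d != positive_id]
    hard = non_positive[:num_neg]  # top-ranked non-relevant docs: hard negatives
    easy = rng.sample([d for d in all_doc_ids if d != positive_id], num_neg)
    return {"query": query, "positive": positive_id,
            "hard_negatives": hard, "random_negatives": easy}

ranked = ["d3", "d7", "d1", "d9", "d2"]  # hypothetical first-stage ranking
print(build_triples("what is dense retrieval", "d7", ranked,
                    all_doc_ids=[f"d{i}" for i in range(10)]))
```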