Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …

Approximate nearest neighbor negative contrastive learning for dense text retrieval

L Xiong, C Xiong, Y Li, KF Tang, J Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Conducting text retrieval in a dense learned representation space has many intriguing
advantages over sparse retrieval. Yet the effectiveness of dense retrieval (DR) often requires …

BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models

N Thakur, N Reimers, A Rücklé, A Srivastava… - arXiv preprint arXiv …, 2021 - arxiv.org
Existing neural information retrieval (IR) models have often been studied in homogeneous
and narrow settings, which has considerably limited insights into their out-of-distribution …

Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

ColBERT: Efficient and effective passage search via contextualized late interaction over BERT

O Khattab, M Zaharia - Proceedings of the 43rd International ACM SIGIR …, 2020 - dl.acm.org
Recent progress in Natural Language Understanding (NLU) is driving fast-paced advances
in Information Retrieval (IR), largely owed to fine-tuning deep language models (LMs) for …

RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering

Y Qu, Y Ding, J Liu, K Liu, R Ren, WX Zhao… - arXiv preprint arXiv …, 2020 - arxiv.org
In open-domain question answering, dense passage retrieval has become a new paradigm
to retrieve relevant passages for finding answers. Typically, the dual-encoder architecture is …

Pyserini: A Python toolkit for reproducible information retrieval research with sparse and dense representations

J Lin, X Ma, SC Lin, JH Yang, R Pradeep… - Proceedings of the 44th …, 2021 - dl.acm.org
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and
dense representations. It aims to provide effective, reproducible, and easy-to-use first-stage …

Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

Overview of the TREC 2019 deep learning track

N Craswell, B Mitra, E Yilmaz, D Campos… - arXiv preprint arXiv …, 2020 - arxiv.org
The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc
ranking in a large data regime. It is the first track with large human-labeled training sets …

RocketQAv2: A joint training method for dense passage retrieval and passage re-ranking

R Ren, Y Qu, J Liu, WX Zhao, Q She, H Wu… - arXiv preprint arXiv …, 2021 - arxiv.org
In various natural language processing tasks, passage retrieval and passage re-ranking are
two key procedures in finding and ranking relevant information. Since both the two …