Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

ColBERTv2: Effective and efficient retrieval via lightweight late interaction

K Santhanam, O Khattab, J Saad-Falcon… - arXiv preprint arXiv …, 2021 - arxiv.org
Neural information retrieval (IR) has greatly advanced search and other knowledge-
intensive language tasks. While many neural IR methods encode queries and documents …

BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models

N Thakur, N Reimers, A Rücklé, A Srivastava… - arXiv preprint arXiv …, 2021 - arxiv.org
Existing neural information retrieval (IR) models have often been studied in homogeneous
and narrow settings, which has considerably limited insights into their out-of-distribution …

Simplified data wrangling with ir_datasets

S MacAvaney, A Yates, S Feldman, D Downey… - Proceedings of the 44th …, 2021 - dl.acm.org
Managing the data for Information Retrieval (IR) experiments can be challenging. Dataset
documentation is scattered across the Internet and once one obtains a copy of the data …

COCO-DR: Combating distribution shifts in zero-shot dense retrieval with contrastive and distributionally robust learning

Y Yu, C Xiong, S Sun, C Zhang, A Overwijk - arXiv preprint arXiv …, 2022 - arxiv.org
We present a new zero-shot dense retrieval (ZeroDR) method, COCO-DR, to improve the
generalization ability of dense retrieval by combating the distribution shifts between source …

The information retrieval experiment platform

M Fröbe, JH Reimer, S MacAvaney, N Deckers… - Proceedings of the 46th …, 2023 - dl.acm.org
We integrate ir_datasets, ir_measures, and PyTerrier with TIRA in the Information Retrieval
Experiment Platform (TIREx) to promote more standardized, reproducible, scalable, and …

Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

J Lee, A Chen, Z Dai, D Dua, DS Sachan… - arXiv preprint arXiv …, 2024 - arxiv.org
Long-context language models (LCLMs) have the potential to revolutionize our approach to
tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging …

Overview of Touché 2021: argument retrieval

A Bondarenko, L Gienapp, M Fröbe, M Beloucif… - Experimental IR Meets …, 2021 - Springer
This paper is a condensed report on the second year of the Touché shared task on
argument retrieval held at CLEF 2021. With the goal to provide a collaborative platform for …

GenQREnsemble: Zero-shot LLM ensemble prompting for generative query reformulation

KD Dhole, E Agichtein - European Conference on Information Retrieval, 2024 - Springer
Query Reformulation (QR) is a set of techniques used to transform a user's original search
query to a text that better aligns with the user's intent and improves their search experience …

LaPraDoR: Unsupervised pretrained dense retriever for zero-shot text retrieval

C Xu, D Guo, N Duan, J McAuley - arXiv preprint arXiv:2203.06169, 2022 - arxiv.org
In this paper, we propose LaPraDoR, a pretrained dual-tower dense retriever that does not
require any supervised data for training. Specifically, we first present Iterative Contrastive …