Information retrieval: recent advances and beyond

KA Hambarde, H Proença - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides a thorough overview of the models and techniques
used in the first and second stages of the typical information retrieval processing chain …
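
Below is a minimal sketch of the two-stage chain the survey describes, assuming BM25 (rank_bm25) for first-stage candidate retrieval and a sentence-transformers cross-encoder for second-stage re-ranking; the model name and toy corpus are illustrative, not taken from the paper.

```python
# Illustrative two-stage pipeline: BM25 first-stage retrieval, cross-encoder re-ranking.
from rank_bm25 import BM25Okapi
from sentence_transformers import CrossEncoder

corpus = [
    "Dense retrieval encodes text into fixed-size vectors.",
    "BM25 is a classical lexical ranking function.",
    "Cross-encoders jointly score a query-document pair.",
]
query = "What is a cross-encoder used for?"

# Stage 1: cheap lexical retrieval over the whole corpus.
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
scores = bm25.get_scores(query.lower().split())
top_k = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:2]

# Stage 2: expensive neural re-ranking of the shortlisted candidates only.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
rerank_scores = reranker.predict([(query, corpus[i]) for i in top_k])
ranked = [corpus[i] for _, i in sorted(zip(rerank_scores, top_k), reverse=True)]
print(ranked[0])
```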

BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models

N Thakur, N Reimers, A Rücklé, A Srivastava… - arXiv preprint arXiv …, 2021 - arxiv.org
Existing neural information retrieval (IR) models have often been studied in homogeneous
and narrow settings, which has considerably limited insights into their out-of-distribution …
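
A short sketch of how such zero-shot evaluation is typically run with the beir package; the dataset (SciFact) and retriever checkpoint are assumptions chosen for illustration.

```python
# Sketch of zero-shot evaluation with the beir package (pip install beir).
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Download one BEIR dataset (SciFact is small enough for a quick run).
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Evaluate an off-the-shelf dense retriever zero-shot, i.e. without in-domain training.
model = DRES(models.SentenceBERT("msmarco-distilbert-base-v3"), batch_size=64)
retriever = EvaluateRetrieval(model, score_function="cos_sim")
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg)
```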

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources in response to users' queries in natural language. From …
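
A minimal sketch of the dense (bi-encoder) retrieval setup the survey covers, assuming a pretrained sentence-transformers checkpoint; the documents and model name are illustrative.

```python
# Minimal dense retrieval sketch with a pretrained bi-encoder (sentence-transformers).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("msmarco-distilbert-base-v3")

docs = [
    "The capital of France is Paris.",
    "Dense retrievers map queries and documents into a shared vector space.",
    "BM25 relies on exact term matching.",
]
doc_emb = encoder.encode(docs, convert_to_tensor=True, normalize_embeddings=True)

query = "How do dense retrievers represent text?"
query_emb = encoder.encode(query, convert_to_tensor=True, normalize_embeddings=True)

# Relevance is estimated by similarity in the shared embedding space.
scores = util.cos_sim(query_emb, doc_emb)[0]
best = int(scores.argmax())
print(docs[best], float(scores[best]))
```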

COCO-DR: Combating distribution shifts in zero-shot dense retrieval with contrastive and distributionally robust learning

Y Yu, C Xiong, S Sun, C Zhang, A Overwijk - arXiv preprint arXiv …, 2022 - arxiv.org
We present a new zero-shot dense retrieval (ZeroDR) method, COCO-DR, to improve the
generalization ability of dense retrieval by combating the distribution shifts between source …
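
As a rough illustration of the distributionally robust ingredient, the sketch below shows a generic group-DRO-style reweighting of per-group losses; it is not COCO-DR's exact iDRO objective, and the grouping, step size eta, and toy data are assumptions.

```python
# Illustrative sketch of distributionally robust reweighting across query groups
# (a generic group-DRO update, not COCO-DR's exact formulation).
import torch

def group_dro_loss(per_example_loss, group_ids, group_weights, eta=0.01):
    """Upweight groups (e.g. query clusters) with high loss so training focuses on them."""
    num_groups = group_weights.numel()
    group_losses = torch.zeros(num_groups, device=per_example_loss.device)
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():
            group_losses[g] = per_example_loss[mask].mean()
    # Exponentiated-gradient update of the group distribution.
    new_weights = group_weights * torch.exp(eta * group_losses)
    new_weights = new_weights / new_weights.sum()
    robust_loss = (new_weights * group_losses).sum()
    return robust_loss, new_weights.detach()

# Toy usage: 6 training examples spread over 3 groups.
losses = torch.tensor([0.2, 0.3, 1.5, 1.4, 0.1, 0.2])
groups = torch.tensor([0, 0, 1, 1, 2, 2])
weights = torch.full((3,), 1 / 3)
loss, weights = group_dro_loss(losses, groups, weights)
print(loss.item(), weights)
```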

LaPraDoR: Unsupervised pretrained dense retriever for zero-shot text retrieval

C Xu, D Guo, N Duan, J McAuley - arXiv preprint arXiv:2203.06169, 2022 - arxiv.org
In this paper, we propose LaPraDoR, a pretrained dual-tower dense retriever that does not
require any supervised data for training. Specifically, we first present Iterative Contrastive …
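
A sketch of the kind of unsupervised in-batch contrastive (InfoNCE) objective a dual-tower retriever can be pretrained with; the temperature and toy tensors are illustrative, and this is not LaPraDoR's full Iterative Contrastive Learning procedure.

```python
# In-batch contrastive (InfoNCE) objective for a dual-tower retriever.
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(query_emb, doc_emb, temperature=0.05):
    """query_emb, doc_emb: (batch, dim); row i of each tower is a positive pair,
    all other rows in the batch act as negatives."""
    query_emb = F.normalize(query_emb, dim=-1)
    doc_emb = F.normalize(doc_emb, dim=-1)
    logits = query_emb @ doc_emb.t() / temperature  # (batch, batch) similarity matrix
    labels = torch.arange(query_emb.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random tensors standing in for encoder outputs.
q = torch.randn(8, 128)
d = torch.randn(8, 128)
print(in_batch_contrastive_loss(q, d).item())
```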

A French corpus for event detection on Twitter

B Mazoyer, J Cagé, N Hervé, C Hudelot - 2020 - sciencespo.hal.science
We present Event2018, a corpus annotated for event detection tasks, consisting of 38 million
tweets in French (retweets excluded) including more than 130,000 tweets manually …

Domain adaptation for memory-efficient dense retrieval

N Thakur, N Reimers, J Lin - arXiv preprint arXiv:2205.11498, 2022 - arxiv.org
Dense retrievers encode documents into fixed-dimensional embeddings. However, storing
all the document embeddings within an index produces bulky indexes that are expensive …
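
One common way to shrink such an index is product quantization; the sketch below contrasts a flat faiss index with a PQ index, with dimensions and PQ parameters chosen purely for illustration (this is not the specific compression approach studied in the paper).

```python
# Sketch of shrinking a dense index with product quantization (faiss).
import numpy as np
import faiss

dim, n_docs = 768, 10000
doc_emb = np.random.rand(n_docs, dim).astype("float32")

# Flat index: stores full float32 vectors (dim * 4 bytes per document).
flat = faiss.IndexFlatIP(dim)
flat.add(doc_emb)

# PQ index: compresses each vector into m sub-quantizer codes (here m bytes per document).
m, nbits = 64, 8
pq = faiss.IndexPQ(dim, m, nbits, faiss.METRIC_INNER_PRODUCT)
pq.train(doc_emb)
pq.add(doc_emb)

query = np.random.rand(1, dim).astype("float32")
scores, ids = pq.search(query, 10)
print(ids[0])
```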

Learning list-level domain-invariant representations for ranking

R Xian, H Zhuang, Z Qin, H Zamani… - Advances in …, 2023 - proceedings.neurips.cc
Domain adaptation aims to transfer the knowledge learned on (data-rich) source
domains to (low-resource) target domains, and a popular method is invariant representation …
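
A generic illustration of invariant representation learning applied to ranking: a listwise softmax loss on labeled source lists plus a simple mean-discrepancy penalty aligning list-level representations across domains; this is a hedged sketch under those assumptions, not the paper's list-level objective.

```python
# Listwise ranking loss on the source domain + a toy list-level alignment penalty.
import torch
import torch.nn.functional as F

def listwise_softmax_loss(scores, labels):
    """scores, labels: (n_lists, list_size); softmax cross-entropy against relevance labels."""
    return -(F.log_softmax(scores, dim=-1) * F.softmax(labels.float(), dim=-1)).sum(-1).mean()

def mean_discrepancy(src_repr, tgt_repr):
    """Penalize the distance between average list-level representations of the two domains."""
    return (src_repr.mean(0) - tgt_repr.mean(0)).pow(2).sum()

# Toy tensors standing in for encoder outputs: 4 lists of 5 documents, 32-dim list features.
src_scores, src_labels = torch.randn(4, 5), torch.randint(0, 3, (4, 5))
src_repr, tgt_repr = torch.randn(4, 32), torch.randn(4, 32)

loss = listwise_softmax_loss(src_scores, src_labels) + 0.1 * mean_discrepancy(src_repr, tgt_repr)
print(loss.item())
```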

Plot retrieval as an assessment of abstract semantic association

S Xu, L Pang, J Li, M Yu, F Meng, H Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
Retrieving relevant plots from a book for a query is a critical task, which can improve the
reading experience and efficiency of readers. Readers usually only give an abstract and …

Augmenting zero-shot dense retrievers with plug-in mixture-of-memories

S Ge, C Xiong, C Rosset, A Overwijk, J Han… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper we improve the zero-shot generalization ability of language models via Mixture-
Of-Memory Augmentation (MoMA), a mechanism that retrieves augmentation documents …
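
A rough sketch of the mixture-of-memories idea: retrieve candidates from several external corpora and pool them as augmentation for the query; the memories, embeddings, and pooling below are illustrative stand-ins, not MoMA's exact architecture.

```python
# Retrieve augmentation documents from several "memories" (external corpora) and pool them.
import numpy as np

def retrieve_from_memory(query_emb, memory_emb, memory_docs, k=2):
    scores = memory_emb @ query_emb
    top = np.argsort(-scores)[:k]
    return [memory_docs[i] for i in top]

rng = np.random.default_rng(0)
dim = 64
query_emb = rng.standard_normal(dim)

# Three heterogeneous memories, each a (documents, embeddings) pair.
memories = {
    name: ([f"{name} doc {i}" for i in range(100)], rng.standard_normal((100, dim)))
    for name in ("wikipedia", "medical", "web")
}

augmentation = []
for name, (docs, emb) in memories.items():
    augmentation.extend(retrieve_from_memory(query_emb, emb, docs, k=2))

print(augmentation)  # pooled augmentation documents fed alongside the original query
```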