Information retrieval: recent advances and beyond
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …
Text and code embeddings by contrastive pre-training
Text embeddings are useful features in many applications such as semantic search and
computing text similarity. Previous work typically trains models customized for different use …
Colbertv2: Effective and efficient retrieval via lightweight late interaction
Neural information retrieval (IR) has greatly advanced search and other knowledge-
intensive language tasks. While many neural IR methods encode queries and documents …
Large dual encoders are generalizable retrievers
It has been shown that dual encoders trained on one domain often fail to generalize to other
domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual …
Promptagator: Few-shot dense retrieval from 8 examples
Much recent research on information retrieval has focused on how to transfer from one task
(typically with abundant supervised data) to various other tasks where supervision is limited …
On the risk of misinformation pollution with large language models
In this paper, we comprehensively investigate the potential misuse of modern Large
Language Models (LLMs) for generating credible-sounding misinformation and its …
Dense text retrieval based on pretrained language models: A survey
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …
Autoregressive search engines: Generating substrings as document identifiers
Knowledge-intensive language tasks require NLP systems to both provide the
correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive …
Conversational information seeking
Conversational information seeking (CIS) is concerned with a sequence of interactions
between one or more users and an information system. Interactions in CIS are primarily …
Query performance prediction for neural IR: Are we there yet?
Evaluation in Information Retrieval (IR) relies on post-hoc empirical procedures,
which are time-consuming and expensive operations. To alleviate this, Query Performance …