Text and code embeddings by contrastive pre-training

A Neelakantan, T Xu, R Puri, A Radford, JM Han… - arXiv preprint arXiv …, 2022 - arxiv.org
Text embeddings are useful features in many applications such as semantic search and
computing text similarity. Previous work typically trains models customized for different use …

Large dual encoders are generalizable retrievers

J Ni, C Qu, J Lu, Z Dai, GH Ábrego, J Ma… - arXiv preprint arXiv …, 2021 - arxiv.org
It has been shown that dual encoders trained on one domain often fail to generalize to other
domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

RocketQAv2: A joint training method for dense passage retrieval and passage re-ranking

R Ren, Y Qu, J Liu, WX Zhao, Q She, H Wu… - arXiv preprint arXiv …, 2021 - arxiv.org
In various natural language processing tasks, passage retrieval and passage re-ranking are
two key procedures in finding and ranking relevant information. Since both the two …

Improving the domain adaptation of retrieval augmented generation (RAG) models for open domain question answering

S Siriwardhana, R Weerasekera, E Wen… - Transactions of the …, 2023 - direct.mit.edu
Retrieval Augment Generation (RAG) is a recent advancement in Open-Domain
Question Answering (ODQA). RAG has only been trained and explored with a Wikipedia …

A neural corpus indexer for document retrieval

Y Wang, Y Hou, H Wang, Z Miao… - Advances in …, 2022 - proceedings.neurips.cc
Current state-of-the-art document retrieval solutions mainly follow an index-retrieve
paradigm, where the index is hard to be directly optimized for the final retrieval target. In this …

Improving passage retrieval with zero-shot question generation

DS Sachan, M Lewis, M Joshi, A Aghajanyan… - arXiv preprint arXiv …, 2022 - arxiv.org
We propose a simple and effective re-ranking method for improving passage retrieval in
open question answering. The re-ranker re-scores retrieved passages with a zero-shot …

End-to-end training of multi-document reader and retriever for open-domain question answering

D Singh, S Reddy, W Hamilton… - Advances in Neural …, 2021 - proceedings.neurips.cc
We present an end-to-end differentiable training method for retrieval-augmented open-
domain question answering systems that combine information from multiple retrieved …

Dense X retrieval: What retrieval granularity should we use?

T Chen, H Wang, S Chen, W Yu, K Ma, X Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Dense retrieval has become a prominent method to obtain relevant context or world
knowledge in open-domain NLP tasks. When we use a learned dense retriever on a …

Adversarial retriever-ranker for dense text retrieval

H Zhang, Y Gong, Y Shen, J Lv, N Duan… - arXiv preprint arXiv …, 2021 - arxiv.org
Current dense text retrieval models face two typical challenges. First, they adopt a siamese
dual-encoder architecture to encode queries and documents independently for fast indexing …