Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

Tool learning with large language models: A survey

C Qu, S Dai, X Wei, H Cai, S Wang, D Yin, J Xu… - Frontiers of Computer …, 2025 - Springer
Recently, tool learning with large language models (LLMs) has emerged as a promising
paradigm for augmenting the capabilities of LLMs to tackle highly complex problems …

Bge m3-embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation

J Chen, S Xiao, P Zhang, K Luo, D Lian… - arxiv preprint arxiv …, 2024 - arxiv.org
In this paper, we present a new embedding model, called M3-Embedding, which is
distinguished for its versatility in Multi-Linguality, Multi-Functionality, and Multi-Granularity. It …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

Text and code embeddings by contrastive pre-training

A Neelakantan, T Xu, R Puri, A Radford, JM Han… - arxiv preprint arxiv …, 2022 - arxiv.org
Text embeddings are useful features in many applications such as semantic search and
computing text similarity. Previous work typically trains models customized for different use …

Knowledge-augmented language model prompting for zero-shot knowledge graph question answering

J Baek, AF Aji, A Saffari - arxiv preprint arxiv:2306.04136, 2023 - arxiv.org
Large Language Models (LLMs) are capable of performing zero-shot closed-book question
answering tasks, based on their internal knowledge stored in parameters during pre …

Colbertv2: Effective and efficient retrieval via lightweight late interaction

K Santhanam, O Khattab, J Saad-Falcon… - arxiv preprint arxiv …, 2021 - arxiv.org
Neural information retrieval (IR) has greatly advanced search and other knowledge-
intensive language tasks. While many neural IR methods encode queries and documents …

Large dual encoders are generalizable retrievers

J Ni, C Qu, J Lu, Z Dai, GH Ábrego, J Ma… - arxiv preprint arxiv …, 2021 - arxiv.org
It has been shown that dual encoders trained on one domain often fail to generalize to other
domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual …

Beir: A heterogenous benchmark for zero-shot evaluation of information retrieval models

N Thakur, N Reimers, A Rücklé, A Srivastava… - arxiv preprint arxiv …, 2021 - arxiv.org
Existing neural information retrieval (IR) models have often been studied in homogeneous
and narrow settings, which has considerably limited insights into their out-of-distribution …

Medcpt: Contrastive pre-trained transformers with large-scale pubmed search logs for zero-shot biomedical information retrieval

Q Jin, W Kim, Q Chen, DC Comeau, L Yeganova… - …, 2023 - academic.oup.com
Motivation: Information retrieval (IR) is essential in biomedical knowledge acquisition and
clinical decision support. While recent progress has shown that language model encoders …