Retrieval-augmented generation for large language models: A survey

Y Gao, Y **ong, X Gao, K Jia, J Pan, Y Bi, Y Dai… - arxiv preprint arxiv …, 2023 - arxiv.org
Large language models (LLMs) demonstrate powerful capabilities, but they still face
challenges in practical applications, such as hallucinations, slow knowledge updates, and …

Domain specialization as the key to make large language models disruptive: A comprehensive survey

C Ling, X Zhao, J Lu, C Deng, C Zheng, J Wang… - arxiv preprint arxiv …, 2023 - arxiv.org
Large language models (LLMs) have significantly advanced the field of natural language
processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arxiv preprint arxiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Large language models for information retrieval: A survey

Y Zhu, H Yuan, S Wang, J Liu, W Liu, C Deng… - arxiv preprint arxiv …, 2023 - arxiv.org
As a primary means of information acquisition, information retrieval (IR) systems, such as
search engines, have integrated themselves into our daily lives. These systems also serve …

Text embeddings by weakly-supervised contrastive pre-training

L Wang, N Yang, X Huang, B Jiao, L Yang… - arxiv preprint arxiv …, 2022 - arxiv.org
This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a
wide range of tasks. The model is trained in a contrastive manner with weak supervision …

Large language models are effective text rankers with pairwise ranking prompting

Z Qin, R Jagerman, K Hui, H Zhuang, J Wu… - arxiv preprint arxiv …, 2023 - arxiv.org
Ranking documents using Large Language Models (LLMs) by directly feeding the query and
candidate documents into the prompt is an interesting and practical problem. However …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

Improving text embeddings with large language models

L Wang, N Yang, X Huang, L Yang… - arxiv preprint arxiv …, 2023 - arxiv.org
In this paper, we introduce a novel and simple method for obtaining high-quality text
embeddings using only synthetic data and less than 1k training steps. Unlike existing …

Crud-rag: A comprehensive chinese benchmark for retrieval-augmented generation of large language models

Y Lyu, Z Li, S Niu, F **ong, B Tang, W Wang… - ACM Transactions on …, 2024 - dl.acm.org
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of
large language models (LLMs) by incorporating external knowledge sources. This method …

Task-aware retrieval with instructions

A Asai, T Schick, P Lewis, X Chen, G Izacard… - arxiv preprint arxiv …, 2022 - arxiv.org
We study the problem of retrieval with instructions, where users of a retrieval system
explicitly describe their intent along with their queries. We aim to develop a general-purpose …