Domain specialization as the key to make large language models disruptive: A comprehensive survey

C Ling, X Zhao, J Lu, C Deng, C Zheng, J Wang… - arxiv preprint arxiv …, 2023 - arxiv.org
Large language models (LLMs) have significantly advanced the field of natural language
processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of …

Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

[PDF][PDF] Retrieval-augmented generation for large language models: A survey

Y Gao, Y **ong, X Gao, K Jia, J Pan, Y Bi… - arxiv preprint arxiv …, 2023 - simg.baai.ac.cn
Large language models (LLMs) demonstrate powerful capabilities, but they still face
challenges in practical applications, such as hallucinations, slow knowledge updates, and …

[PDF][PDF] A survey of large language models

WX Zhao, K Zhou, J Li, T Tang… - arxiv preprint arxiv …, 2023 - paper-notes.zhjwpku.com
Ever since the Turing Test was proposed in the 1950s, humans have explored the mastering
of language intelligence by machine. Language is essentially a complex, intricate system of …

Improving text embeddings with large language models

L Wang, N Yang, X Huang, L Yang… - arxiv preprint arxiv …, 2023 - arxiv.org
In this paper, we introduce a novel and simple method for obtaining high-quality text
embeddings using only synthetic data and less than 1k training steps. Unlike existing …

Text embeddings by weakly-supervised contrastive pre-training

L Wang, N Yang, X Huang, B Jiao, L Yang… - arxiv preprint arxiv …, 2022 - arxiv.org
This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a
wide range of tasks. The model is trained in a contrastive manner with weak supervision …

Large language models for information retrieval: A survey

Y Zhu, H Yuan, S Wang, J Liu, W Liu, C Deng… - arxiv preprint arxiv …, 2023 - arxiv.org
As a primary means of information acquisition, information retrieval (IR) systems, such as
search engines, have integrated themselves into our daily lives. These systems also serve …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

Large language models are effective text rankers with pairwise ranking prompting

Z Qin, R Jagerman, K Hui, H Zhuang, J Wu… - arxiv preprint arxiv …, 2023 - arxiv.org
Ranking documents using Large Language Models (LLMs) by directly feeding the query and
candidate documents into the prompt is an interesting and practical problem. However …

A survey on knowledge distillation of large language models

X Xu, M Li, C Tao, T Shen, R Cheng, J Li, C Xu… - arxiv preprint arxiv …, 2024 - arxiv.org
In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a
pivotal methodology for transferring advanced capabilities from leading proprietary LLMs …