Tool learning with large language models: A survey

C Qu, S Dai, X Wei, H Cai, S Wang, D Yin, J Xu… - Frontiers of Computer …, 2025 - Springer
Recently, tool learning with large language models (LLMs) has emerged as a promising
paradigm for augmenting the capabilities of LLMs to tackle highly complex problems …

Retrieval-augmented generation for natural language processing: A survey

S Wu, Y Xiong, Y Cui, H Wu, C Chen, Y Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated great success in various fields,
benefiting from their huge amount of parameters that store knowledge. However, LLMs still …

Improving text embeddings with large language models

L Wang, N Yang, X Huang, L Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we introduce a novel and simple method for obtaining high-quality text
embeddings using only synthetic data and less than 1k training steps. Unlike existing …

Searching for best practices in retrieval-augmented generation

X Wang, Z Wang, X Gao, F Zhang, Y Wu… - Proceedings of the …, 2024 - aclanthology.org
Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating
up-to-date information, mitigating hallucinations, and enhancing response quality …

LongRAG: Enhancing retrieval-augmented generation with long-context LLMs

Z Jiang, X Ma, W Chen - arXiv preprint arXiv:2406.15319, 2024 - arxiv.org
In the traditional RAG framework, the basic retrieval units are normally short: common
retrievers such as DPR typically work with 100-word Wikipedia paragraphs. Such a design …
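
For context, a minimal sketch (not the LongRAG authors' code) of the long-retrieval-unit idea described in this snippet: consecutive short passages are greedily merged into units of a few thousand tokens before indexing, so the retriever returns fewer, longer, more self-contained contexts. The function name and the token budget are illustrative assumptions.

```python
# Hedged sketch: merge short passages into long retrieval units before indexing.
from typing import List

def group_into_long_units(passages: List[str], max_tokens: int = 4000) -> List[str]:
    """Greedily merge consecutive short passages into long retrieval units.

    Token counting here is a crude whitespace approximation; a real system
    would use the reader LLM's tokenizer.
    """
    units, current, current_len = [], [], 0
    for passage in passages:
        n_tokens = len(passage.split())  # rough proxy for token count
        if current and current_len + n_tokens > max_tokens:
            units.append("\n".join(current))
            current, current_len = [], 0
        current.append(passage)
        current_len += n_tokens
    if current:
        units.append("\n".join(current))
    return units

if __name__ == "__main__":
    # 50 short ~100-word passages collapse into a handful of long units,
    # each of which can be embedded and indexed as a single retrieval entry.
    passages = [f"Passage {i}: " + "word " * 100 for i in range(50)]
    print(len(group_into_long_units(passages, max_tokens=1500)))
```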

LongLLMLingua: Accelerating and enhancing LLMs in long context scenarios via prompt compression

H Jiang, Q Wu, X Luo, D Li, CY Lin, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
In long context scenarios, large language models (LLMs) face three main challenges: higher
computational/financial cost, longer latency, and inferior performance. Some studies reveal …

BGE M3-Embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation

J Chen, S Xiao, P Zhang, K Luo, D Lian… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present a new embedding model, called M3-Embedding, which is
distinguished for its versatility in Multi-Linguality, Multi-Functionality, and Multi-Granularity. It …
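
For context, a hedged usage sketch of dense multilingual retrieval with M3-Embedding, assuming the FlagEmbedding package and its BGEM3FlagModel wrapper as documented in the BGE project; exact argument names and defaults may differ between releases, and the example texts are illustrative.

```python
# Hedged sketch: dense multilingual retrieval with M3-Embedding via FlagEmbedding.
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)  # multilingual, long-input model

queries = [
    "What is retrieval-augmented generation?",
    "什么是检索增强生成？",  # Chinese: "What is retrieval-augmented generation?"
]
docs = [
    "Retrieval-augmented generation (RAG) grounds LLM outputs in retrieved documents.",
    "Dense retrieval represents queries and documents as vectors.",
]

# encode() returns a dict; 'dense_vecs' holds one dense embedding per input text.
q_emb = model.encode(queries, max_length=512)["dense_vecs"]
d_emb = model.encode(docs, max_length=512)["dense_vecs"]

# Inner products of the dense vectors serve as relevance scores; the same model
# can also emit sparse and multi-vector (ColBERT-style) representations for
# hybrid retrieval, per its multi-functionality claim.
scores = q_emb @ d_emb.T
print(scores)
```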

When large language models meet vector databases: A survey

Z Jing, Y Su, Y Han, B Yuan, H Xu, C Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
This survey explores the synergistic potential of Large Language Models (LLMs) and Vector
Databases (VecDBs), a burgeoning but rapidly evolving research area. With the proliferation …

Llama2Vec: Unsupervised adaptation of large language models for dense retrieval

C Li, Z Liu, S Xiao, Y Shao, D Lian - … of the 62nd Annual Meeting of …, 2024 - aclanthology.org
Dense retrieval calls for discriminative embeddings to represent the semantic relationship
between query and document. It may benefit from the use of large language models …

Retrieval-augmented generation for ai-generated content: A survey

P Zhao, H Zhang, Q Yu, Z Wang, Y Geng, F Fu… - arXiv preprint arXiv …, 2024 - arxiv.org
The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by
advancements in model algorithms, scalable foundation model architectures, and the …