Context Embeddings for Efficient Answer Generation in RAG

D Rau, S Wang, H Déjean, S Clinchant - arXiv preprint arXiv:2407.09252, 2024 - arxiv.org
Retrieval-Augmented Generation (RAG) overcomes the limited knowledge of LLMs
by extending the input with external information. As a consequence, the contextual inputs to …
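The idea this abstract describes, extending an LLM's input with retrieved external text, can be sketched as a minimal pipeline. The word-overlap retriever, toy corpus, and prompt template below are hypothetical illustrations, not the paper's method:

```python
# Minimal RAG prompt-assembly sketch. The retriever is a hypothetical
# stand-in that ranks documents by word overlap with the question.
def retrieve(question, corpus, k=2):
    q_terms = set(question.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question, corpus, k=2):
    # Prepend the retrieved passages so the LLM can ground its answer.
    context = "\n".join(f"- {doc}" for doc in retrieve(question, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

corpus = [
    "Context pruning shortens retrieved passages before generation.",
    "The Eiffel Tower is in Paris.",
    "RAG extends LLM inputs with retrieved documents.",
]
prompt = build_prompt("What does RAG do to LLM inputs?", corpus)
```

The assembled prompt grows with every retrieved document, which is exactly the long-context cost that the compression and pruning papers in this listing try to reduce.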

Provence: Efficient and Robust Context Pruning for Retrieval-Augmented Generation

N Chirkova, T Formal, V Nikoulina… - arXiv preprint arXiv …, 2025 - arxiv.org
Retrieval-augmented generation improves various aspects of large language model
(LLM) generation, but suffers from computational overhead caused by long contexts as well …
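Context pruning in the sense this abstract gestures at means dropping retrieved sentences that are irrelevant to the query before they reach the LLM. A hypothetical lexical-overlap heuristic sketches the idea (Provence itself trains a model for this decision; overlap is only an illustrative stand-in):

```python
# Hypothetical context-pruning sketch: keep only sentences that share
# at least min_overlap terms with the query. Not Provence's method,
# which learns the keep/drop decision with a trained pruner.
import re

def prune_context(query, passage, min_overlap=1):
    q_terms = set(re.findall(r"\w+", query.lower()))
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", passage):
        s_terms = set(re.findall(r"\w+", sentence.lower()))
        if len(q_terms & s_terms) >= min_overlap:
            kept.append(sentence)
    return " ".join(kept)

passage = ("Context pruning shortens RAG inputs. "
           "The weather was pleasant that day. "
           "Shorter inputs reduce inference cost.")
pruned = prune_context("How does pruning reduce RAG cost?", passage)
```

The off-topic middle sentence is dropped while both query-relevant sentences survive, shrinking the context the LLM must process.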

HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions

J Mozafari, B Piryani, A Abdallah, A Jatowt - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Models (LLMs) are transforming how people find information, and many
users nowadays turn to chatbots to obtain answers to their questions. Despite the instant …

PISCO: Pretty Simple Compression for Retrieval-Augmented Generation

M Louis, H Déjean, S Clinchant - arXiv preprint arXiv:2501.16075, 2025 - arxiv.org
Retrieval-Augmented Generation (RAG) pipelines enhance Large Language Models (LLMs)
by retrieving relevant documents, but they face scalability issues due to high inference costs …
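The inference-cost problem this abstract mentions is often attacked by compressing each retrieved document into far fewer vectors than its token count. A toy mean-pooling sketch illustrates the shape of such a compression (PISCO learns its compression; pooling here is only a hypothetical stand-in, and the vectors are toy values):

```python
# Toy context-compression sketch: collapse a document's token vectors
# into a fixed number of pooled "summary" vectors. Illustrative only;
# learned compressors replace the mean-pooling used here.
def compress(token_embeddings, num_slots):
    """Mean-pool a sequence of vectors into num_slots summary vectors."""
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    slots = []
    for s in range(num_slots):
        start = s * n // num_slots
        end = (s + 1) * n // num_slots
        chunk = token_embeddings[start:end]
        slots.append([sum(v[d] for v in chunk) / len(chunk)
                      for d in range(dim)])
    return slots

# 8 token vectors of dimension 2, compressed to 2 slots: 4x shorter input.
tokens = [[float(i), float(i % 2)] for i in range(8)]
compressed = compress(tokens, num_slots=2)
```

Whatever the compressor, the payoff is the same: the generator attends over a handful of slots per document instead of its full token sequence.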

Benchmarking of Retrieval Augmented Generation: A Comprehensive Systematic Literature Review on Evaluation Dimensions, Evaluation Metrics and …

S Knollmeyer, O Caymazer, L Koval, MU Akmal, S Asif… - scitepress.org
Despite rapid advancements in the field of Large Language Models (LLMs), traditional
benchmarks have proven to be inadequate for assessing the performance of Retrieval …