Leave no document behind: Benchmarking long-context LLMs with extended multi-doc QA

M Wang, L Chen, F Cheng, S Liao… - Proceedings of the …, 2024 - aclanthology.org
Long-context modeling capabilities of Large Language Models (LLMs) have garnered
widespread attention, leading to the emergence of LLMs with ultra-context windows …

BABILong: Testing the limits of LLMs with long context reasoning-in-a-haystack

Y Kuratov, A Bulatov, P Anokhin… - The Thirty-eight …, 2024 - proceedings.neurips.cc
In recent years, the input context sizes of large language models (LLMs) have increased
dramatically. However, existing evaluation methods have not kept pace, failing to …

Long context is not long at all: A prospector of long-dependency data for large language models

L Chen, Z Liu, W He, Y Li, R Luo, M Yang - arXiv preprint arXiv …, 2024 - arxiv.org
Long-context modeling capabilities are important for large language models (LLMs) in
various applications. However, directly training LLMs with long context windows is …

Evaluating Top-k RAG-based approach for Game Review Generation

P Chauhan, RK Sahani, S Datta, A Qadir… - 2024 IEEE …, 2024 - ieeexplore.ieee.org
Having access to public opinion for a particular product can be a cumbersome task. There
are multiple reviews for the same product. Some may be good or bad depending on the bias …

LCFO: Long context and long form output dataset and benchmarking

MR Costa-jussà, P Andrews, MC Meglioli… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents the Long Context and Form Output (LCFO) benchmark, a novel
evaluation framework for assessing gradual summarization and summary expansion …

VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation

M Suri, P Mathur, F Dernoncourt, K Goswami… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding information from a collection of multiple documents, particularly those with
visually rich elements, is important for document-grounded question answering. This paper …

Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks

Z Yang - arXiv preprint arXiv:2409.06338, 2024 - arxiv.org
We argue that there are two major distinct capabilities in long context understanding:
retrieval and holistic understanding. Understanding and further improving LLMs' long …