A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv …, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …
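The NIAH setup is easy to reproduce in a few lines; below is a minimal sketch (the filler text, needle, and parameter values are illustrative, not RULER's actual task variants):

```python
def build_niah_prompt(needle: str, filler: str, total_chars: int, depth: float) -> str:
    """Embed a 'needle' sentence at a relative depth inside filler 'haystack' text."""
    # Tile the filler until it reaches the target haystack length.
    haystack = (filler * (total_chars // len(filler) + 1))[:total_chars]
    pos = int(len(haystack) * depth)
    return haystack[:pos] + " " + needle + " " + haystack[pos:]

# Illustrative usage: sweep depth from 0.0 to 1.0 to probe positional effects.
prompt = build_niah_prompt(
    needle="The magic number is 42917.",
    filler="The grass is green. The sky is blue. ",
    total_chars=8000,  # scale toward the model's advertised context limit
    depth=0.5,
)
question = "\n\nWhat is the magic number mentioned in the text above?"
```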

Benchmarking foundation models with language-model-as-an-examiner

Y Bai, J Ying, Y Cao, X Lv, Y He… - Advances in …, 2024 - proceedings.neurips.cc
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …
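In its simplest form, the examiner idea has one LLM pose a question, a second answer it, and the first grade the answer; a minimal sketch, with `query_llm` as a hypothetical stand-in for an actual API client:

```python
def query_llm(model: str, prompt: str) -> str:
    """Hypothetical helper; wire this to your LLM API of choice."""
    raise NotImplementedError

def examine(examiner: str, candidate: str, topic: str) -> str:
    # The examiner model generates an open-ended question on the topic...
    question = query_llm(examiner, f"Ask one challenging question about {topic}.")
    # ...the candidate model answers it...
    answer = query_llm(candidate, question)
    # ...and the examiner grades the answer against its own knowledge.
    return query_llm(
        examiner,
        f"Question: {question}\nAnswer: {answer}\n"
        "Rate the answer's correctness from 1 to 5 and justify briefly.",
    )
```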

Datasets for large language models: A comprehensive survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper presents an exploration of Large Language Model (LLM) datasets,
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …

LLM maybe LongLM: Self-extend LLM context window without tuning

H Jin, X Han, J Yang, Z Jiang, Z Liu, CY Chang… - arXiv preprint arXiv …, 2024 - arxiv.org
This work elicits LLMs' inherent ability to handle long contexts without fine-tuning. The
limited length of sequences seen during training may limit the application of Large …
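The core trick in Self-Extend is to keep exact relative positions inside a local window and map more distant positions onto coarser "grouped" indices via floor division. A simplified sketch of that idea (window and group sizes are illustrative, and this is not necessarily the paper's exact merging rule):

```python
def self_extend_rel_pos(q_idx: int, k_idx: int, window: int = 512, group: int = 4) -> int:
    """Effective relative position under grouped attention, Self-Extend style."""
    rel = q_idx - k_idx
    if rel <= window:
        return rel  # neighbor attention: exact relative positions
    # Grouped attention: floor-divided positions, shifted so the two
    # regimes join continuously at rel == window.
    return rel // group + window - window // group
```

Because distant positions are reused across `group`-sized buckets, the model never sees a relative position larger than those encountered during training, which is what lets the window extend without tuning.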

Data engineering for scaling language models to 128k context

Y Fu, R Panda, X Niu, X Yue, H Hajishirzi, Y Kim… - arXiv preprint arXiv …, 2024 - arxiv.org
We study the continual pretraining recipe for scaling language models' context lengths to
128K, with a focus on data engineering. We hypothesize that long context modeling, in …
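One ingredient of such a recipe is upsampling long documents while preserving the per-source mixture; a toy sketch of that idea (the threshold and factor are made up for illustration, not the paper's recipe):

```python
from collections import defaultdict

def upsample_long_docs(docs, min_chars=100_000, factor=4):
    """Repeat long documents within their own source so long sequences
    become more common without much changing the overall domain mixture.

    `docs` is an iterable of (source, text) pairs.
    """
    by_source = defaultdict(list)
    for source, text in docs:
        weight = factor if len(text) >= min_chars else 1
        by_source[source].extend([text] * weight)
    return by_source
```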

One thousand and one pairs: A "novel" challenge for long-context language models

M Karpinska, K Thai, K Lo, T Goyal, M Iyyer - arXiv preprint arXiv …, 2024 - arxiv.org
Synthetic long-context LLM benchmarks (e.g., "needle-in-the-haystack") test only surface-
level retrieval capabilities, but how well can long-context LLMs retrieve, synthesize, and …

ThinK: Thinner key cache by query-driven pruning

Y Xu, Z Jie, H Dong, L Wang, X Lu, A Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have revolutionized the field of natural language
processing, achieving unprecedented performance across a variety of applications …
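ThinK prunes the key cache along the channel (head-dimension) axis, scoring channels by how much they contribute to query-key interactions. A simplified, illustration-only version of that criterion:

```python
import numpy as np

def prune_key_channels(q_recent: np.ndarray, k_cache: np.ndarray, keep: int):
    """Query-driven channel pruning of a key cache, in the spirit of ThinK.

    q_recent: (q_len, d) recent query states used for scoring.
    k_cache:  (k_len, d) cached key states to be thinned.
    Keeps the `keep` channels whose magnitudes contribute most to Q K^T;
    this is a simplification of the paper's criterion.
    """
    scores = np.linalg.norm(q_recent, axis=0) * np.linalg.norm(k_cache, axis=0)
    kept = np.argsort(scores)[-keep:]  # indices of the highest-scoring channels
    return k_cache[:, kept], kept      # thinned cache plus indices for realignment
```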

InfLLM: Training-free long-context extrapolation for LLMs with an efficient context memory

C Xiao, P Zhang, X Han, G Xiao, Y Lin… - The Thirty-eighth …, 2024 - openreview.net
Large language models (LLMs) have emerged as a cornerstone in real-world applications
with lengthy streaming inputs (e.g., LLM-driven agents). However, existing LLMs, pre-trained …
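The context-memory idea stores distant key/value states as blocks, each summarized by a few representative vectors, and retrieves only the most relevant blocks per step. A loose sketch (representative selection and scoring are simplified relative to the paper):

```python
import numpy as np

def retrieve_blocks(query: np.ndarray, block_reprs: list, top_k: int = 2) -> list:
    """Block-level context-memory lookup, loosely in the style of InfLLM.

    query:       (d,) current query vector.
    block_reprs: list of (r_i, d) arrays of representative keys per block.
    Scores each block by the best match between the query and its
    representatives; the top_k blocks' full KV states would then be
    loaded into the attention window.
    """
    scores = [float(np.max(reprs @ query)) for reprs in block_reprs]
    return sorted(np.argsort(scores)[-top_k:].tolist())
```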

Clinical entity augmented retrieval for clinical information extraction

I Lopez, A Swaminathan, K Vedula, S Narayanan… - npj Digital …, 2025 - nature.com
Large language models (LLMs) with retrieval-augmented generation (RAG) have improved
information extraction over previous methods, yet their reliance on embeddings often leads …