A survey of GPT-3 family large language models including ChatGPT and GPT-4
KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …
RULER: What's the Real Context Size of Your Long-Context Language Models?
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …
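The NIAH setup described above can be made concrete with a minimal sketch: a short "needle" fact is inserted at a chosen relative depth into a long run of distractor text, and the model's answer is scored by substring match. The function names, filler text, and scoring rule here are illustrative assumptions, not taken from the RULER paper.

```python
def build_niah_prompt(needle: str, filler: str, n_filler: int, depth: float) -> str:
    """Assemble a needle-in-a-haystack prompt: n_filler distractor sentences
    with the needle inserted at a relative depth in [0, 1]."""
    haystack = [filler] * n_filler
    haystack.insert(int(depth * n_filler), needle)
    return " ".join(haystack) + "\n\nQuestion: What is the magic number mentioned in the text?"

def score_retrieval(model_answer: str, expected: str) -> bool:
    """Substring match, a common scoring rule for NIAH-style tests."""
    return expected in model_answer

# Usage: a real harness would sweep depth and context length,
# send each prompt to the model under test, and score its answer.
prompt = build_niah_prompt("The magic number is 7481.",
                           "The sky was a calm, even blue.",
                           n_filler=200, depth=0.5)
```

Sweeping `depth` from 0.0 to 1.0 at several context lengths is what produces the familiar NIAH heatmaps of retrieval accuracy by needle position.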
Benchmarking foundation models with language-model-as-an-examiner
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …
Datasets for large language models: A comprehensive survey
This paper embarks on an exploration into the Large Language Model (LLM) datasets,
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …
LLM Maybe LongLM: Self-Extend LLM context window without tuning
This work elicits LLMs' inherent ability to handle long contexts without fine-tuning. The
limited length of the training sequence during training may limit the application of Large …
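The core idea behind this kind of tuning-free extension is to remap out-of-window relative positions into groups, so distant tokens reuse position ids the model saw during training. A minimal sketch follows; the window size, group size, and exact shift are illustrative assumptions, not values from the paper.

```python
def remap_relative_position(d: int, neighbor_window: int = 512, group_size: int = 8) -> int:
    """Map a query-key relative distance d to an in-distribution position id.
    Distances inside the neighbor window are kept exact; farther distances
    are compressed by floor division into coarse groups."""
    if d < neighbor_window:
        return d
    return neighbor_window + (d - neighbor_window) // group_size
```

With these settings, a relative distance of 4608 maps to 512 + 4096 // 8 = 1024, so attention over a 4608-token span only ever uses position ids within a 1024-token range, while nearby tokens keep exact positions.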
Data engineering for scaling language models to 128k context
We study the continual pretraining recipe for scaling language models' context lengths to
128K, with a focus on data engineering. We hypothesize that long context modeling, in …
One thousand and one pairs: A "novel" challenge for long-context language models
Synthetic long-context LLM benchmarks (e.g., "needle-in-the-haystack") test only surface-
level retrieval capabilities, but how well can long-context LLMs retrieve, synthesize, and …
ThinK: Thinner key cache by query-driven pruning
Large Language Models (LLMs) have revolutionized the field of natural language
processing, achieving unprecedented performance across a variety of applications …
InfLLM: Training-free long-context extrapolation for LLMs with an efficient context memory
Large language models (LLMs) have emerged as a cornerstone in real-world applications
with lengthy streaming inputs (e.g., LLM-driven agents). However, existing LLMs, pre-trained …
Clinical entity augmented retrieval for clinical information extraction
Large language models (LLMs) with retrieval-augmented generation (RAG) have improved
information extraction over previous methods, yet their reliance on embeddings often leads …