A survey of GPT-3 family large language models including ChatGPT and GPT-4
KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …
Evaluating large language models: A comprehensive survey
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …
Efficient streaming language models with attention sinks
Deploying Large Language Models (LLMs) in streaming applications such as multi-round
dialogue, where long interactions are expected, is urgently needed but poses two major …
Efficient and effective text encoding for Chinese LLaMA and Alpaca
Large Language Models (LLMs), such as ChatGPT and GPT-4, have dramatically
transformed natural language processing research and shown promising strides towards …
RULER: What's the Real Context Size of Your Long-Context Language Models?
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …
Benchmarking foundation models with language-model-as-an-examiner
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …
∞Bench: Extending long context evaluation beyond 100K tokens
Processing and reasoning over long contexts is crucial for many practical applications of
Large Language Models (LLMs), such as document comprehension and agent construction …
LM-Infinite: Simple on-the-fly length generalization for large language models
In recent years, there have been remarkable advancements in the performance of
Transformer-based Large Language Models (LLMs) across various domains. As these LLMs …
Leave no document behind: Benchmarking long-context LLMs with extended multi-doc QA
Long-context modeling capabilities of Large Language Models (LLMs) have garnered
widespread attention, leading to the emergence of LLMs with ultra-context windows …
Soaring from 4K to 400K: Extending LLM's context with activation beacon
The utilization of long contexts poses a big challenge for large language models due to their
limited context window length. Although the context window can be extended through fine …