[HTML] A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arXiv preprint arXiv…, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

Efficient streaming language models with attention sinks

G Xiao, Y Tian, B Chen, S Han, M Lewis - arXiv preprint arXiv:2309.17453, 2023 - arxiv.org
Deploying Large Language Models (LLMs) in streaming applications such as multi-round
dialogue, where long interactions are expected, is urgently needed but poses two major …

Efficient and effective text encoding for Chinese LLaMA and Alpaca

Y Cui, Z Yang, X Yao - arXiv preprint arXiv:2304.08177, 2023 - arxiv.org
Large Language Models (LLMs), such as ChatGPT and GPT-4, have dramatically
transformed natural language processing research and shown promising strides towards …

RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv…, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …

Benchmarking foundation models with language-model-as-an-examiner

Y Bai, J Ying, Y Cao, X Lv, Y He… - Advances in …, 2024 - proceedings.neurips.cc
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …

∞Bench: Extending long context evaluation beyond 100K tokens

X Zhang, Y Chen, S Hu, Z Xu, J Chen… - Proceedings of the …, 2024 - aclanthology.org
Processing and reasoning over long contexts is crucial for many practical applications of
Large Language Models (LLMs), such as document comprehension and agent construction …

LM-Infinite: Simple on-the-fly length generalization for large language models

C Han, Q Wang, W Xiong, Y Chen, H Ji… - arXiv preprint arXiv…, 2023 - arxiv.org
In recent years, there have been remarkable advancements in the performance of
Transformer-based Large Language Models (LLMs) across various domains. As these LLMs …

Leave no document behind: Benchmarking long-context LLMs with extended multi-doc QA

M Wang, L Chen, F Cheng, S Liao… - Proceedings of the …, 2024 - aclanthology.org
Long-context modeling capabilities of Large Language Models (LLMs) have garnered
widespread attention, leading to the emergence of LLMs with ultra-context windows …

[PDF] Soaring from 4K to 400K: Extending LLM's context with activation beacon

P Zhang, Z Liu, S Xiao, N Shao, Q Ye… - arXiv preprint arXiv…, 2024 - openreview.net
The utilization of long contexts poses a big challenge for large language models due to their
limited context window length. Although the context window can be extended through fine …