A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

Scaling data-constrained language models

N Muennighoff, A Rush, B Barak… - Advances in …, 2023 - proceedings.neurips.cc
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

Llemma: An open language model for mathematics

Z Azerbayev, H Schoelkopf, K Paster… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Llemma, a large language model for mathematics. We continue pretraining
Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing …

OctoPack: Instruction tuning code large language models

N Muennighoff, Q Liu, A Zebaze, Q Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Finetuning large language models (LLMs) on instructions leads to vast performance
improvements on natural language tasks. We apply instruction tuning using code …

Aya 23: Open weight releases to further multilingual progress

V Aryabumi, J Dang, D Talupuru, S Dash… - arXiv preprint arXiv …, 2024 - arxiv.org
This technical report introduces Aya 23, a family of multilingual language models. Aya 23
builds on the recent release of the Aya model (Üstün et al., 2024), focusing on pairing a …

Having beer after prayer? Measuring cultural bias in large language models

T Naous, MJ Ryan, A Ritter, W Xu - arXiv preprint arXiv:2305.14456, 2023 - arxiv.org
As the reach of large language models (LMs) expands globally, their ability to cater to
diverse cultural contexts becomes crucial. Despite advancements in multilingual …

Sabiá: Portuguese large language models

R Pires, H Abonizio, TS Almeida… - Brazilian Conference on …, 2023 - Springer
As the capabilities of language models continue to advance, it is conceivable that a “one-size-
fits-all” model will remain as the main paradigm. For instance, given the vast number of …

Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …

FinGPT: Large generative models for a small language

R Luukkonen, V Komulainen, J Luoma… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) excel in many tasks in NLP and beyond, but most open
models have very limited coverage of smaller languages and LLM work tends to focus on …