A survey on data selection for language models
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …
Scaling data-constrained language models
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
Llemma: An open language model for mathematics
We present Llemma, a large language model for mathematics. We continue pretraining
Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing …
OctoPack: Instruction tuning code large language models
Finetuning large language models (LLMs) on instructions leads to vast performance
improvements on natural language tasks. We apply instruction tuning using code …
Aya 23: Open weight releases to further multilingual progress
This technical report introduces Aya 23, a family of multilingual language models. Aya 23
builds on the recent release of the Aya model (Üstün et al., 2024), focusing on pairing a …
Having beer after prayer? measuring cultural bias in large language models
As the reach of large language models (LMs) expands globally, their ability to cater to
diverse cultural contexts becomes crucial. Despite advancements in multilingual …
Sabiá: Portuguese large language models
As the capabilities of language models continue to advance, it is conceivable that “one-size-
fits-all” model will remain as the main paradigm. For instance, given the vast number of …
Aya model: An instruction finetuned open-access multilingual language model
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …
FinGPT: Large generative models for a small language
R Luukkonen, V Komulainen, J Luoma… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) excel in many tasks in NLP and beyond, but most open
models have very limited coverage of smaller languages and LLM work tends to focus on …