A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

On-device language models: A comprehensive review

J Xu, Z Li, W Chen, Q Wang, X Gao, Q Cai… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of large language models (LLMs) revolutionized natural language processing
applications, and running LLMs on edge devices has become increasingly attractive for …

Molmo and PixMo: Open weights and open data for state-of-the-art multimodal models

M Deitke, C Clark, S Lee, R Tripathi, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Today's most advanced multimodal models remain proprietary. The strongest open-weight
models rely heavily on synthetic data from proprietary VLMs to achieve good performance …

Language models scale reliably with over-training and on downstream tasks

SY Gadre, G Smyrnis, V Shankar, S Gururangan… - arXiv preprint arXiv …, 2024 - arxiv.org
Scaling laws are useful guides for derisking expensive training runs, as they predict
performance of large models using cheaper, small-scale experiments. However, there …

Consent in crisis: The rapid decline of the AI data commons

S Longpre, R Mahari, A Lee, C Lund, H Oderinwale… - NeurIPS, 2024 - hal.science
General-purpose artificial intelligence (AI) systems are built on massive swathes of public
web data, assembled into corpora such as C4, RefinedWeb, and Dolma. To our knowledge …

Leave no context behind: Efficient infinite context transformers with infini-attention

T Munkhdalai, M Faruqui, S Gopal - arXiv preprint arXiv:2404.07143, 2024 - arxiv.org
This work introduces an efficient method to scale Transformer-based Large Language
Models (LLMs) to infinitely long inputs with bounded memory and computation. A key …

Generative language models exhibit social identity biases

T Hu, Y Kyrychenko, S Rathje, N Collier… - Nature Computational …, 2024 - nature.com
Social identity biases, particularly the tendency to favor one's own group (ingroup solidarity)
and derogate other groups (outgroup hostility), are deeply rooted in human psychology and …

Generalization vs Memorization: Tracing Language Models' Capabilities Back to Pretraining Data

X Wang, A Antoniades, Y Elazar, A Amayuelas… - arXiv preprint arXiv …, 2024 - arxiv.org
The impressive capabilities of large language models (LLMs) have sparked debate over
whether these models genuinely generalize to unseen tasks or predominantly rely on …

Position: Key claims in LLM research have a long tail of footnotes

A Rogers, S Luccioni - Forty-first International Conference on …, 2024 - openreview.net
Much of the recent discourse within the ML community has been centered around Large
Language Models (LLMs), their functionality and potential--yet not only do we not have a …

RedPajama: An open dataset for training large language models

M Weber, D Fu, Q Anthony, Y Oren, S Adams… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models are increasingly becoming a cornerstone technology in artificial
intelligence, the sciences, and society as a whole, yet the optimal strategies for dataset …