Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Physics-informed machine learning: case studies for weather and climate modelling

K Kashinath, M Mustafa, A Albert… - … of the Royal …, 2021 - royalsocietypublishing.org
Machine learning (ML) provides novel and powerful ways of accurately and efficiently
recognizing complex patterns, emulating nonlinear dynamics, and predicting the spatio …

Are emergent abilities of large language models a mirage?

R Schaeffer, B Miranda… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recent work claims that large language models display "emergent abilities": abilities
not present in smaller-scale models that are present in larger-scale models. What makes …

Scaling deep learning for materials discovery

A Merchant, S Batzner, SS Schoenholz, M Aykol… - Nature, 2023 - nature.com
Novel functional materials enable fundamental breakthroughs across technological
applications, from clean energy to information processing. From microchips to batteries …

Reproducible scaling laws for contrastive language-image learning

M Cherti, R Beaumont, R Wightman… - Proceedings of the …, 2023 - openaccess.thecvf.com
Scaling up neural networks has led to remarkable performance across a wide range of
tasks. Moreover, performance often follows reliable scaling laws as a function of training set …

BLOOM: A 176B-parameter open-access multilingual language model

T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow… - 2023 - inria.hal.science
Large language models (LLMs) have been shown to be able to perform new tasks based on
a few demonstrations or natural language instructions. While these capabilities have led to …

eDiff-I: Text-to-image diffusion models with an ensemble of expert denoisers

Y Balaji, S Nah, X Huang, A Vahdat, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org
Large-scale diffusion-based generative models have led to breakthroughs in text-
conditioned high-resolution image synthesis. Starting from random noise, such text-to-image …

Textbooks are all you need

S Gunasekar, Y Zhang, J Aneja, CCT Mendes… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce phi-1, a new large language model for code, with significantly smaller size
than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained …

Beyond neural scaling laws: beating power law scaling via data pruning

B Sorscher, R Geirhos, S Shekhar… - Advances in …, 2022 - proceedings.neurips.cc
Widely observed neural scaling laws, in which error falls off as a power of the training set
size, model size, or both, have driven substantial performance improvements in deep …
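The snippet above describes the power-law form these scaling laws take: error falls off as a power of training-set size, i.e. E(N) = a · N^(-b). A minimal sketch of how such an exponent can be recovered from (size, error) pairs via a log-log linear fit, using synthetic data with assumed constants (a = 5.0, b = 0.35, not values from the paper):

```python
import numpy as np

# Hypothetical illustration: assume test error follows E(N) = a * N**(-b).
# Generate noiseless synthetic (N, error) pairs from a known power law,
# then recover the exponent b by linear regression in log-log space.

a_true, b_true = 5.0, 0.35          # assumed constants for this sketch
N = np.array([1e3, 1e4, 1e5, 1e6])  # training-set sizes
err = a_true * N ** (-b_true)       # power-law errors

# log E = log a - b * log N, so the slope of the log-log fit is -b
slope, intercept = np.polyfit(np.log(N), np.log(err), 1)
b_est, a_est = -slope, np.exp(intercept)
print(b_est, a_est)
```

On noiseless data the fit recovers a and b exactly; the data-pruning result cited here is precisely about breaking this power-law form, not just shifting its constants.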

Super-NaturalInstructions: Generalization via declarative instructions on 1600+ NLP tasks

Y Wang, S Mishra, P Alipoormolabashi, Y Kordi… - arXiv preprint arXiv …, 2022 - arxiv.org
How well can NLP models generalize to a variety of unseen tasks when provided with task
instructions? To address this question, we first introduce Super-NaturalInstructions, a …