A comprehensive overview of large language models
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …
Semantic structure in deep learning
E Pavlick - Annual Review of Linguistics, 2022 - annualreviews.org
Deep learning has recently come to dominate computational linguistics, leading to claims of
human-level performance in a range of language processing tasks. Like much previous …
Language is not all you need: Aligning perception with language models
A big convergence of language, multimodal perception, action, and world modeling is a key
step toward artificial general intelligence. In this work, we introduce KOSMOS-1, a …
Is ChatGPT a general-purpose natural language processing task solver?
Spurred by advancements in scale, large language models (LLMs) have demonstrated the
ability to perform a variety of natural language processing (NLP) tasks zero-shot, i.e., without …
Symbolic discovery of optimization algorithms
We present a method to formulate algorithm discovery as program search, and apply it to
discover optimization algorithms for deep neural network training. We leverage efficient …
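The snippet breaks off before the discovered algorithm itself. The optimizer most commonly associated with this program-search result is a sign-based momentum update (reported under the name Lion); the following sketch assumes that reading, and the toy quadratic objective, learning rate, and beta values are illustrative choices only, not the paper's setup.

```python
import numpy as np

# Minimal sketch of a sign-based momentum update of the kind reported as the
# outcome of this program search (commonly called Lion). Hyperparameters and
# the toy objective below are invented for illustration.

def lion_step(theta, grad, m, lr=0.01, beta1=0.9, beta2=0.99, wd=0.0):
    update = np.sign(beta1 * m + (1 - beta1) * grad)  # interpolated momentum, sign only
    theta = theta - lr * (update + wd * theta)        # decoupled weight decay
    m = beta2 * m + (1 - beta2) * grad                # momentum tracks the raw gradient
    return theta, m

theta, m = np.full(4, 3.0), np.zeros(4)
for _ in range(300):
    grad = 2 * (theta - 1.0)                          # gradient of a toy quadratic
    theta, m = lion_step(theta, grad, m)
print(theta)                                          # oscillates near the minimizer at 1.0
```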
Scaling data-constrained language models
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …
Fine-tuning language models with just forward passes
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …
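The snippet stops before the method, but the forward-pass-only idea it points to is typically a zeroth-order gradient estimate built from perturbed forward evaluations rather than backpropagation. Below is a minimal SPSA-style sketch under that assumption; the toy objective, step size, and perturbation scale are invented for illustration.

```python
import numpy as np

# Illustrative two-point zeroth-order (SPSA-style) gradient estimate: a
# forward-pass-only stand-in for backpropagation. Not the paper's actual
# model, data, or hyperparameters.

def spsa_gradient(loss_fn, theta, rng, eps=1e-3):
    z = rng.standard_normal(theta.shape)                  # random perturbation direction
    delta = loss_fn(theta + eps * z) - loss_fn(theta - eps * z)
    return (delta / (2 * eps)) * z                        # unbiased estimate of the gradient

rng = np.random.default_rng(0)
loss_fn = lambda w: np.sum((w - 1.0) ** 2)                # toy objective with minimum at 1.0
theta = np.zeros(4)
for _ in range(500):
    theta -= 0.05 * spsa_gradient(loss_fn, theta, rng)    # two forward passes per step, no backprop
print(theta)                                              # approaches the minimizer at 1.0
```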
Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning
Few-shot in-context learning (ICL) enables pre-trained language models to perform a
previously-unseen task without any gradient-based training by feeding a small number of …
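Since the snippet describes in-context learning as feeding a small number of demonstrations to a frozen model, a minimal sketch of how such a prompt is assembled may help; the formatting, the example pairs, and reading the next generated token as the label are illustrative assumptions, not the paper's exact protocol.

```python
# Hypothetical sketch of a few-shot in-context learning prompt: the pretrained
# model is not updated; demonstration input-label pairs are concatenated ahead
# of the query, and the model's continuation is read off as the prediction.

def build_icl_prompt(demonstrations, query):
    """Format demonstration pairs and the unseen query into one prompt string."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in demonstrations]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

demos = [("the movie was wonderful", "positive"),
         ("a dull, lifeless plot", "negative")]
prompt = build_icl_prompt(demos, "surprisingly sharp writing")
print(prompt)   # sent to a frozen LM; its next token serves as the predicted label
```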
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
Large pretrained language models have shown surprising in-context learning (ICL) ability.
With a few demonstration input-label pairs, they can predict the label for an unseen input …
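This paper's dual-form argument is often illustrated with linear attention: attending to demonstration key-value pairs acts like an accumulated, gradient-descent-style rank-1 update to the weights applied to the query. The toy check below assumes that simplified linear-attention reading; all shapes and values are made up.

```python
import numpy as np

# Toy check of the "ICL as implicit gradient descent" view under a linear
# attention simplification: attention over demonstration key-value pairs adds
# sum_i v_i k_i^T to the weights applied to the query, matching an
# outer-product (gradient-style) update of a linear layer.

rng = np.random.default_rng(0)
d = 3
W0 = rng.standard_normal((d, d))      # "zero-shot" weights applied to the query
K = rng.standard_normal((5, d))       # demonstration keys
V = rng.standard_normal((5, d))       # demonstration values
q = rng.standard_normal(d)            # query vector

attn_out = W0 @ q + V.T @ (K @ q)     # linear attention over the demonstrations
W_updated = W0 + V.T @ K              # equivalent accumulated rank-1 weight update
assert np.allclose(attn_out, W_updated @ q)
print(attn_out)
```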