Meta-learned models of cognition
Psychologists and neuroscientists extensively rely on computational models for studying
and analyzing the human mind. Traditionally, such computational models have been hand …
Collective intelligence for deep learning: A survey of recent developments
In the past decade, we have witnessed the rise of deep learning to dominate the field of
artificial intelligence. Advances in artificial neural networks alongside corresponding …
Transformers learn in-context by gradient descent
At present, the mechanisms of in-context learning in Transformers are not well understood
and remain mostly an intuition. In this paper, we suggest that training Transformers on auto …
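The identity behind this paper's claim can be checked in a few lines. Below is a minimal NumPy sketch (our illustration, not the paper's code): for linear regression with weights initialised at zero, the prediction after one gradient-descent step on the in-context examples equals an unnormalised linear-attention readout with keys x_i, values y_i, and query x_q. The learning rate eta and the random data are arbitrary choices for the demonstration.

    import numpy as np

    rng = np.random.default_rng(0)
    d, N = 5, 32
    w_true = rng.normal(size=d)        # hidden linear task w*
    X = rng.normal(size=(N, d))        # in-context inputs x_1..x_N
    y = X @ w_true                     # in-context targets y_i = w* . x_i
    x_q = rng.normal(size=d)           # query input
    eta = 1.0                          # arbitrary learning rate

    # One gradient-descent step on L(w) = 1/(2N) * sum_i (w.x_i - y_i)^2,
    # starting from w = 0, gives w_1 = (eta/N) * sum_i y_i x_i:
    w_gd = (eta / N) * (y @ X)
    pred_gd = w_gd @ x_q

    # The same prediction written as unnormalised linear attention:
    # keys k_i = x_i, values v_i = y_i, query q = x_q.
    pred_attn = (eta / N) * np.sum(y * (X @ x_q))

    print(np.allclose(pred_gd, pred_attn))  # True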
Transformers as statisticians: Provable in-context learning with in-context algorithm selection
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …
What can transformers learn in-context? A case study of simple function classes
In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …
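To make the prompt format described in this abstract concrete, here is a small sketch in Python; make_icl_prompt is our own hypothetical helper, not an interface from the paper. It interleaves (input, output) pairs drawn from a linear function class and appends the query whose label the model must predict.

    import numpy as np

    def make_icl_prompt(f, xs, x_query):
        # Interleave in-context (input, output) pairs, then append the
        # new query input; the model's task is to predict f(x_query).
        tokens = []
        for x in xs:
            tokens.append(x)       # in-context input
            tokens.append(f(x))    # its label
        tokens.append(x_query)     # query with no label
        return tokens

    rng = np.random.default_rng(1)
    w = rng.normal(size=3)                       # a random task from the class
    f = lambda x: float(w @ x)                   # linear function to learn
    xs = [rng.normal(size=3) for _ in range(4)]  # four demonstrations
    prompt = make_icl_prompt(f, xs, rng.normal(size=3))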
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
Large pretrained language models have shown surprising in-context learning (ICL) ability.
With a few demonstration input-label pairs, they can predict the label for an unseen input …
Transformers as algorithms: Generalization and stability in in-context learning
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
Learning to (learn at test time): RNNs with expressive hidden states
Self-attention performs well in long context but has quadratic complexity. Existing RNN
layers have linear complexity, but their performance in long context is limited by the …
Linear transformers are secretly fast weight programmers
We show the formal equivalence of linearised self-attention mechanisms and fast weight
controllers from the early '90s, where a slow neural net learns by gradient descent to …
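The stated equivalence is easy to verify numerically. The sketch below (an illustration for the unnormalised linearised-attention setting, not the authors' code) shows that a fast-weight memory updated by rank-one outer products v_t k_t^T and read with q_t produces the same outputs as causal linear self-attention over the same keys, values, and queries.

    import numpy as np

    rng = np.random.default_rng(0)
    T, d = 6, 4
    K = rng.normal(size=(T, d))   # keys k_t
    V = rng.normal(size=(T, d))   # values v_t
    Q = rng.normal(size=(T, d))   # queries q_t

    # Fast-weight view: W accumulates rank-1 writes v_t k_t^T ("programming"
    # the fast weights) and is read out by the query q_t.
    W = np.zeros((d, d))
    y_fast = []
    for t in range(T):
        W += np.outer(V[t], K[t])     # write
        y_fast.append(W @ Q[t])       # read
    y_fast = np.stack(y_fast)

    # Causal, unnormalised linearised self-attention over the same sequence:
    y_attn = np.stack([
        sum(V[i] * (K[i] @ Q[t]) for i in range(t + 1))
        for t in range(T)
    ])

    print(np.allclose(y_fast, y_attn))  # True: identical outputs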
Deep language models for interpretative and predictive materials science
Machine learning (ML) has emerged as an indispensable methodology to describe,
discover, and predict complex physical phenomena that efficiently help us learn underlying …