Meta-learned models of cognition

M Binz, I Dasgupta, AK Jagadish… - Behavioral and Brain …, 2024 - cambridge.org
Psychologists and neuroscientists extensively rely on computational models for studying
and analyzing the human mind. Traditionally, such computational models have been hand …

Collective intelligence for deep learning: A survey of recent developments

D Ha, Y Tang - Collective Intelligence, 2022 - journals.sagepub.com
In the past decade, we have witnessed the rise of deep learning to dominate the field of
artificial intelligence. Advances in artificial neural networks alongside corresponding …

Transformers learn in-context by gradient descent

J Von Oswald, E Niklasson… - International …, 2023 - proceedings.mlr.press
At present, the mechanisms of in-context learning in Transformers are not well understood
and remain mostly an intuition. In this paper, we suggest that training Transformers on auto …
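A minimal sketch of the mechanism the snippet alludes to (my own illustration, not the paper's code): one gradient-descent step on an in-context linear-regression task, the kind of update the authors argue a trained linear self-attention layer can emulate. All variable names and the learning rate are assumptions.

```python
# Hypothetical sketch: one gradient-descent step on an in-context
# linear-regression task (not the paper's implementation).
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 16                      # input dimension, number of in-context examples
w_true = rng.normal(size=d)       # task vector the prompt implicitly defines
X = rng.normal(size=(n, d))       # in-context inputs
y = X @ w_true                    # in-context targets

w = np.zeros(d)                   # "model" implied by the forward pass
lr = 0.1
grad = -X.T @ (y - X @ w) / n     # gradient of 0.5 * mean squared error
w = w - lr * grad                 # one in-context "learning" step

x_query = rng.normal(size=d)
print("prediction after one step:", w @ x_query)
print("target:", w_true @ x_query)
```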

Transformers as statisticians: Provable in-context learning with in-context algorithm selection

Y Bai, F Chen, H Wang, C Xiong… - Advances in neural …, 2024 - proceedings.neurips.cc
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …

What can transformers learn in-context? A case study of simple function classes

S Garg, D Tsipras, PS Liang… - Advances in Neural …, 2022 - proceedings.neurips.cc
In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …
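A hypothetical sketch of the evaluation setup this snippet describes: a prompt of in-context (input, output) pairs drawn from a simple function class (here, linear functions) followed by a query input, plus the least-squares oracle such a learner is typically compared against. The tokenisation and padding scheme below are assumptions; the paper's exact formatting may differ.

```python
# Hypothetical sketch of an in-context learning prompt for a simple
# function class (linear functions); not the paper's code.
import numpy as np

rng = np.random.default_rng(1)
d, n_examples = 8, 20
w = rng.normal(size=d)                        # one task = one linear function
xs = rng.normal(size=(n_examples, d))
ys = xs @ w

# Interleave x_1, f(x_1), ..., x_n, f(x_n), x_query as one prompt sequence
# (scalar outputs padded to the input width so every token has the same shape).
x_query = rng.normal(size=d)
tokens = []
for x, y in zip(xs, ys):
    tokens.append(x)
    tokens.append(np.concatenate([[y], np.zeros(d - 1)]))
tokens.append(x_query)
prompt = np.stack(tokens)                     # shape: (2 * n_examples + 1, d)

# Least-squares oracle baseline for comparison.
w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)
print(prompt.shape, "oracle prediction:", x_query @ w_hat)
```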

Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers

D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui… - arXiv preprint arXiv …, 2022 - arxiv.org
Large pretrained language models have shown surprising in-context learning (ICL) ability.
With a few demonstration input-label pairs, they can predict the label for an unseen input …

Transformers as algorithms: Generalization and stability in in-context learning

Y Li, ME Ildiz, D Papailiopoulos… - … on Machine Learning, 2023 - proceedings.mlr.press
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …

Learning to (learn at test time): RNNs with expressive hidden states

Y Sun, X Li, K Dalal, J Xu, A Vikram, G Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-attention performs well in long context but has quadratic complexity. Existing RNN
layers have linear complexity, but their performance in long context is limited by the …
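A hypothetical sketch of the general idea the title points at (my illustration, not the paper's architecture): the recurrent layer's hidden state is itself a small linear model W, updated by one gradient step per token on a self-supervised reconstruction loss, so the cost stays linear in sequence length. The reconstruction objective, learning rate, and shapes below are assumptions.

```python
# Hypothetical sketch: an "expressive hidden state" that is a weight matrix
# updated at test time by a per-token gradient step (not the paper's code).
import numpy as np

rng = np.random.default_rng(2)
d, T = 16, 128
seq = rng.normal(size=(T, d))

W = np.zeros((d, d))            # hidden state: a whole weight matrix
lr = 0.05
outputs = []
for x in seq:
    loss_grad = (W @ x - x)[:, None] @ x[None, :]   # grad of 0.5 * ||W x - x||^2
    W = W - lr * loss_grad                           # test-time "learning" update
    outputs.append(W @ x)                            # layer output for this token

print(np.stack(outputs).shape)   # (128, 16): one output per input token
```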

Linear transformers are secretly fast weight programmers

I Schlag, K Irie, J Schmidhuber - International Conference on …, 2021 - proceedings.mlr.press
We show the formal equivalence of linearised self-attention mechanisms and fast weight
controllers from the early '90s, where a slow neural net learns by gradient descent to …
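A small numerical check of the equivalence the snippet describes (a sketch under my own assumptions, with the feature maps of the paper omitted): unnormalised linear attention over a prefix equals reading out a fast weight matrix built from outer products of values and keys.

```python
# Hypothetical check: fast-weight readout vs. unnormalised linear attention
# (feature maps and normalisation omitted; not the paper's code).
import numpy as np

rng = np.random.default_rng(3)
d, T = 8, 32
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
Q = rng.normal(size=(T, d))

# Fast-weight view: W_t = W_{t-1} + v_t k_t^T,  y_t = W_t q_t
W = np.zeros((d, d))
fast_out = []
for k, v, q in zip(K, V, Q):
    W = W + np.outer(v, k)
    fast_out.append(W @ q)
fast_out = np.stack(fast_out)

# Attention view: y_t = sum over s <= t of v_s * (k_s . q_t)
attn_out = np.stack([(V[: t + 1].T * (K[: t + 1] @ Q[t])).sum(axis=1)
                     for t in range(T)])

print(np.allclose(fast_out, attn_out))   # True: the two views coincide
```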

Deep language models for interpretative and predictive materials science

Y Hu, MJ Buehler - APL Machine Learning, 2023 - pubs.aip.org
Machine learning (ML) has emerged as an indispensable methodology to describe,
discover, and predict complex physical phenomena that efficiently help us learn underlying …