Transformers as statisticians: Provable in-context learning with in-context algorithm selection

Y Bai, F Chen, H Wang, C Xiong… - Advances in neural …, 2023 - proceedings.neurips.cc
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …

Transformers as algorithms: Generalization and stability in in-context learning

Y Li, ME Ildiz, D Papailiopoulos… - … conference on machine …, 2023 - proceedings.mlr.press
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
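Below is a minimal sketch of the ICL prompt format described in that snippet: a sequence of (input, output) demonstrations is serialized ahead of a query, and a causal language model predicts the query's output on-the-fly, with no weight updates. The task, model choice, and prompt template here are illustrative assumptions, not taken from the paper.

```python
# Minimal in-context learning sketch (assumed example, not the paper's setup):
# demonstrations + query go into one prompt; the model completes the output.
from transformers import AutoModelForCausalLM, AutoTokenizer

examples = [("2, 4", "6"), ("1, 7", "8"), ("5, 5", "10")]  # (input, output) pairs
query = "3, 9"

# Serialize the demonstrations and the query into a single prompt string.
prompt = "".join(f"Input: {x}\nOutput: {y}\n\n" for x, y in examples)
prompt += f"Input: {query}\nOutput:"

tok = AutoTokenizer.from_pretrained("gpt2")          # any causal LM works here
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok(prompt, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=5, pad_token_id=tok.eos_token_id)
print(tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True))
```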

Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging

S Azizi, L Culp, J Freyberg, B Mustafa, S Baur… - Nature Biomedical …, 2023 - nature.com
Machine-learning models for medical tasks can match or surpass the performance
of clinical experts. However, in settings differing from those of the training dataset, the …

What makes multi-modal learning better than single (provably)

Y Huang, C Du, Z Xue, X Chen… - Advances in Neural …, 2021 - proceedings.neurips.cc
The world provides us with data of multiple modalities. Intuitively, models fusing data from
different modalities outperform their uni-modal counterparts, since more information is …

Universal prompt tuning for graph neural networks

T Fang, Y Zhang, Y Yang, C Wang… - Advances in Neural …, 2023 - proceedings.neurips.cc
In recent years, prompt tuning has sparked a research surge in adapting pre-trained models.
Unlike the unified pre-training strategy employed in the language field, the graph field …

Rethinking few-shot image classification: a good embedding is all you need?

Y Tian, Y Wang, D Krishnan, JB Tenenbaum… - Computer Vision–ECCV …, 2020 - Springer
The focus of recent meta-learning research has been on the development of learning
algorithms that can quickly adapt to test time tasks with limited data and low computational …

A kernel-based view of language model fine-tuning

S Malladi, A Wettig, D Yu, D Chen… - … on Machine Learning, 2023 - proceedings.mlr.press
It has become standard to solve NLP tasks by fine-tuning pre-trained language models
(LMs), especially in low-data settings. There is minimal theoretical understanding of …

FedAvg with fine tuning: Local updates lead to representation learning

L Collins, H Hassani, A Mokhtari… - Advances in Neural …, 2022 - proceedings.neurips.cc
The Federated Averaging (FedAvg) algorithm, which consists of alternating
between a few local stochastic gradient updates at client nodes, followed by a model …
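A minimal sketch of the FedAvg loop described in that snippet, assuming full client participation and equally weighted clients: each round, every client runs a few local SGD steps from the current global model, and the server averages the resulting client models. The toy model and data are placeholders, not drawn from the paper.

```python
# FedAvg sketch under simplifying assumptions (full participation, equal weights).
import copy
import torch
from torch import nn

def local_update(global_model, data, lr=0.1, local_steps=5):
    """Run a few local SGD steps starting from the global weights."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(local_steps):
        for x, y in data:                      # client's local minibatches
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()

def fedavg_round(global_model, client_datasets):
    """One round: local updates at each client, then server-side averaging."""
    client_states = [local_update(global_model, d) for d in client_datasets]
    avg_state = {
        k: torch.stack([s[k] for s in client_states]).mean(dim=0)
        for k in client_states[0]
    }
    global_model.load_state_dict(avg_state)
    return global_model

# Toy usage: two clients, a shared linear model, synthetic regression data.
clients = [[(torch.randn(8, 3), torch.randn(8, 1))] for _ in range(2)]
model = nn.Linear(3, 1)
for _ in range(3):                             # a few communication rounds
    model = fedavg_round(model, clients)
```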

Variational model inversion attacks

KC Wang, Y Fu, K Li, A Khisti… - Advances in Neural …, 2021 - proceedings.neurips.cc
Given the ubiquity of deep neural networks, it is important that these models do not reveal
information about sensitive data that they have been trained on. In model inversion attacks …

On the theory of transfer learning: The importance of task diversity

N Tripuraneni, M Jordan, C Jin - Advances in neural …, 2020 - proceedings.neurips.cc
We provide new statistical guarantees for transfer learning via representation learning--
when transfer is achieved by learning a feature representation shared across different tasks …