Transformers as statisticians: Provable in-context learning with in-context algorithm selection

Y Bai, F Chen, H Wang, C Xiong… - Advances in neural …, 2023 - proceedings.neurips.cc
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …

Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging

S Azizi, L Culp, J Freyberg, B Mustafa, S Baur… - Nature Biomedical …, 2023 - nature.com
Machine-learning models for medical tasks can match or surpass the performance
of clinical experts. However, in settings differing from those of the training dataset, the …

Transformers as algorithms: Generalization and stability in in-context learning

Y Li, ME Ildiz, D Papailiopoulos… - … on Machine Learning, 2023 - proceedings.mlr.press
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
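
A minimal sketch of the (input, output)-example prompting format the snippet describes; the task, demonstrations, and query below are hypothetical illustrations, not taken from the paper:

    # In-context learning prompt: (input, output) demonstrations followed by a
    # query; the model is expected to complete the query's output with no
    # parameter updates.
    demonstrations = [("great movie", "positive"),
                      ("boring plot", "negative")]
    query = "wonderful acting"

    prompt = "".join(f"Input: {x}\nOutput: {y}\n" for x, y in demonstrations)
    prompt += f"Input: {query}\nOutput:"
    # `prompt` would then be handed to a pretrained transformer, which should
    # complete it with the query's label (here, "positive").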

Rethinking few-shot image classification: a good embedding is all you need?

Y Tian, Y Wang, D Krishnan, JB Tenenbaum… - Computer Vision–ECCV …, 2020 - Springer
The focus of recent meta-learning research has been on the development of learning
algorithms that can quickly adapt to test time tasks with limited data and low computational …

Universal prompt tuning for graph neural networks

T Fang, Y Zhang, Y Yang, C Wang… - Advances in Neural …, 2023 - proceedings.neurips.cc
In recent years, prompt tuning has sparked a research surge in adapting pre-trained models.
Unlike the unified pre-training strategy employed in the language field, the graph field …

What makes multi-modal learning better than single (provably)

Y Huang, C Du, Z Xue, X Chen… - Advances in Neural …, 2021 - proceedings.neurips.cc
The world provides us with data of multiple modalities. Intuitively, models fusing data from
different modalities outperform their uni-modal counterparts, since more information is …

Variational model inversion attacks

KC Wang, Y Fu, K Li, A Khisti… - Advances in Neural …, 2021 - proceedings.neurips.cc
Given the ubiquity of deep neural networks, it is important that these models do not reveal
information about sensitive data that they have been trained on. In model inversion attacks …

Revisiting scalarization in multi-task learning: A theoretical perspective

Y Hu, R Xian, Q Wu, Q Fan, L Yin… - Advances in Neural …, 2023 - proceedings.neurips.cc
Linear scalarization, i.e., combining all loss functions by a weighted sum, has been the
default choice in the literature of multi-task learning (MTL) since its inception. In recent years …
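
A minimal sketch of linear scalarization as described above: the multi-task objective is a fixed weighted sum of per-task losses. The task names, loss values, and weights here are hypothetical placeholders:

    # Linear scalarization: collapse several task losses into one scalar
    # objective via a weighted sum, then optimize that single objective.
    task_losses = {"segmentation": 0.82, "depth": 1.37, "normals": 0.45}
    weights = {"segmentation": 1.0, "depth": 0.5, "normals": 0.5}

    scalarized_loss = sum(weights[t] * task_losses[t] for t in task_losses)
    # During training, task_losses would be recomputed each step and the
    # gradient of scalarized_loss would drive one shared parameter update.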

A kernel-based view of language model fine-tuning

S Malladi, A Wettig, D Yu, D Chen… - … on Machine Learning, 2023 - proceedings.mlr.press
It has become standard to solve NLP tasks by fine-tuning pre-trained language models
(LMs), especially in low-data settings. There is minimal theoretical understanding of …

Spectral methods for data science: A statistical perspective

Y Chen, Y Chi, J Fan, C Ma - Foundations and Trends® in …, 2021 - nowpublishers.com
Spectral methods have emerged as a simple yet surprisingly effective approach for
extracting information from massive, noisy and incomplete data. In a nutshell, spectral …