Transformers as algorithms: Generalization and stability in in-context learning

Y Li, ME Ildiz, D Papailiopoulos… - … on Machine Learning, 2023 - proceedings.mlr.press
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
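
The stylized setup in this line of work treats the prompt itself as a dataset: (input, output) pairs are interleaved into one sequence and the model predicts the label of a final query input. A minimal sketch of that prompt construction is below (assumptions: a synthetic linear-regression task and numpy; this illustrates the generic ICL format, not the authors' exact protocol or model).

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_examples = 5, 8
    w_task = rng.normal(size=d)                  # latent task vector defining this task
    xs = rng.normal(size=(n_examples + 1, d))    # last row is the query input
    ys = xs @ w_task

    # Interleave (input, output) pairs into a single prompt sequence; the query's
    # label slot is left as a placeholder for the model to predict on-the-fly.
    prompt = [np.concatenate([x, [y]]) for x, y in zip(xs[:-1], ys[:-1])]
    prompt.append(np.concatenate([xs[-1], [0.0]]))   # query token with masked label
    prompt = np.stack(prompt)                        # shape: (n_examples + 1, d + 1)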

FedAvg with fine tuning: Local updates lead to representation learning

L Collins, H Hassani, A Mokhtari… - Advances in Neural …, 2022 - proceedings.neurips.cc
The Federated Averaging (FedAvg) algorithm, which consists of alternating
between a few local stochastic gradient updates at client nodes, followed by a model …
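
The alternation described here (a few local gradient steps at each client, then server-side averaging) can be sketched in a few lines. The sketch below assumes a synthetic least-squares objective, full-batch rather than stochastic local updates, and hypothetical dimensions, so it illustrates the FedAvg loop rather than the paper's experimental setup.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_clients, local_steps, lr = 5, 4, 3, 0.1

    # Hypothetical per-client least-squares data.
    client_data = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(n_clients)]
    w_global = np.zeros(d)

    for _ in range(10):                            # communication rounds
        local_models = []
        for X, y in client_data:
            w = w_global.copy()
            for _ in range(local_steps):           # a few local gradient updates
                w -= lr * X.T @ (X @ w - y) / len(y)
            local_models.append(w)
        w_global = np.mean(local_models, axis=0)   # server averages the client models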

Learning to generate image embeddings with user-level differential privacy

Z Xu, M Collins, Y Wang, L Panait… - Proceedings of the …, 2023 - openaccess.thecvf.com
Small on-device models have been successfully trained with user-level differential privacy
(DP) for next word prediction and image classification tasks in the past. However, existing …

A conditional gradient-based method for simple bilevel optimization with convex lower-level problem

R Jiang, N Abolfazli, A Mokhtari… - International …, 2023 - proceedings.mlr.press
In this paper, we study a class of bilevel optimization problems, also known as simple bilevel
optimization, where we minimize a smooth objective function over the optimal solution set of …
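
The "simple bilevel" template referred to here is standardly written as follows, with f the smooth upper-level objective and g the convex lower-level objective whose solution set constrains the problem (a generic formulation consistent with the snippet, not a restatement of the paper's exact assumptions):

    \min_{x \in \mathbb{R}^d} \; f(x)
    \quad \text{s.t.} \quad
    x \in \arg\min_{z \in \mathbb{R}^d} g(z)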

Provable multi-task representation learning by two-layer ReLU neural networks

L Collins, H Hassani, M Soltanolkotabi… - … of machine learning …, 2024 - pmc.ncbi.nlm.nih.gov
An increasingly popular machine learning paradigm is to pretrain a neural network (NN) on
many tasks offline, then adapt it to downstream tasks, often by re-training only the last linear …
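
The adaptation step mentioned here (freeze the pretrained representation, refit only the last linear layer) can be sketched as follows; the random frozen first layer, the least-squares head fit, and all dimensions are illustrative assumptions, not the paper's construction.

    import numpy as np

    rng = np.random.default_rng(0)
    d, k, n = 10, 16, 200

    # Stand-in for an offline-pretrained shared representation: a fixed
    # two-layer ReLU feature map h(x) = ReLU(x W^T).
    W_shared = rng.normal(size=(k, d))
    def features(X):
        return np.maximum(X @ W_shared.T, 0.0)

    # Downstream adaptation: keep W_shared frozen and refit only the last
    # linear layer, here by least squares on the frozen features.
    X_new, y_new = rng.normal(size=(n, d)), rng.normal(size=n)
    head, *_ = np.linalg.lstsq(features(X_new), y_new, rcond=None)
    preds = features(X_new) @ head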

Holistic transfer: towards non-disruptive fine-tuning with partial target data

CH Tu, HY Chen, Z Mai, J Zhong… - Advances in …, 2024 - proceedings.neurips.cc
We propose a learning problem that involves adapting a pre-trained source model to the target
domain to classify all classes that appeared in the source data, using target data that …

Metalearning with very few samples per task

M Aliakbarpour, K Bairaktari, G Brown… - The Thirty Seventh …, 2024 - proceedings.mlr.press
Metalearning and multitask learning are two frameworks for solving a group of related
learning tasks more efficiently than we could hope to solve each of the individual tasks on …

Generalization error for portable rewards in transfer imitation learning

Y Zhou, L Wang, M Lu, Z Xu, J Tang, Y Zhang… - Knowledge-Based …, 2024 - Elsevier
The reward transfer paradigm in transfer imitation learning (TIL) leverages the reward
learned via inverse reinforcement learning (IRL) in the source environment to re-optimize a …
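
The reward-transfer idea summarized here can be sketched as: estimate a reward in the source environment from expert behavior, then re-optimize a policy against that reward in the target environment. The sketch below uses a crude linear-IRL surrogate (expert minus uniform visitation as reward weights), random tabular dynamics, and value iteration, all of which are illustrative assumptions rather than the paper's method.

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions, gamma = 6, 2, 0.9

    # Source environment: a crude IRL surrogate that scores states the expert
    # visits more often above uniformly visited ones.
    phi = np.eye(n_states)                               # one-hot state features
    expert_visits = np.array([0.05, 0.05, 0.1, 0.1, 0.3, 0.4])
    uniform_visits = np.full(n_states, 1.0 / n_states)
    theta = phi.T @ (expert_visits - uniform_visits)     # transferred reward weights
    reward = phi @ theta

    # Target environment: re-optimize a policy against the transferred reward
    # via value iteration over random transition dynamics.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # shape (s, a, s')
    V = np.zeros(n_states)
    for _ in range(200):
        Q = reward[:, None] + gamma * np.einsum("san,n->sa", P, V)
        V = Q.max(axis=1)
    policy = Q.argmax(axis=1)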

Understanding inverse scaling and emergence in multitask representation learning

ME Ildiz, Z Zhao, S Oymak - International Conference on …, 2024 - proceedings.mlr.press
Large language models exhibit strong multitasking capabilities; however, their learning
dynamics as a function of task characteristics, sample size, and model complexity remain …

Provable pathways: Learning multiple tasks over multiple paths

Y Li, S Oymak - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
Constructing useful representations across a large number of tasks is a key requirement for
sample-efficient intelligent systems. A traditional idea in multitask learning (MTL) is building …