Scalable PAC-Bayesian meta-learning via the PAC-optimal hyper-posterior: from theory to practice

J Rothfuss, M Josifoski, V Fortuin… - The Journal of Machine …, 2023 - dl.acm.org
Meta-Learning aims to speed up the learning process on new tasks by acquiring useful
inductive biases from datasets of related learning tasks. While, in practice, the number of …

Online clustering of bandits with misspecified user models

Z Wang, J Xie, X Liu, S Li, J Lui - Advances in Neural …, 2023 - proceedings.neurips.cc
The contextual linear bandit is an important online learning problem where given arm
features, a learning agent selects an arm at each round to maximize the cumulative rewards …
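The contextual linear bandit protocol sketched in this snippet (observe arm features, pick an arm, collect a noisy linear reward) can be illustrated with a minimal LinUCB-style loop. This is a toy sketch under assumed dimensions and noise level, not the algorithm from any paper listed here; the `linucb_round` helper is a made-up name:

```python
import numpy as np

def linucb_round(A, b, arm_features, alpha=1.0):
    """One round of a LinUCB-style rule: pick the arm with the highest
    upper confidence bound under a ridge-regression reward model."""
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b  # ridge estimate of the unknown parameter
    # UCB = predicted reward + exploration bonus (Mahalanobis norm of the arm)
    ucb = arm_features @ theta_hat + alpha * np.sqrt(
        np.einsum('ij,jk,ik->i', arm_features, A_inv, arm_features))
    return int(np.argmax(ucb))

rng = np.random.default_rng(0)
d, K, T = 5, 10, 200                       # feature dim, arms per round, rounds
theta_star = rng.normal(size=d)            # unknown true reward parameter
A, b = np.eye(d), np.zeros(d)              # ridge sufficient statistics
total_reward = 0.0
for t in range(T):
    X = rng.normal(size=(K, d))            # arm features revealed this round
    a = linucb_round(A, b, X)
    r = X[a] @ theta_star + 0.1 * rng.normal()  # noisy linear reward
    A += np.outer(X[a], X[a])              # update statistics with chosen arm
    b += r * X[a]
    total_reward += r
```

The agent's cumulative reward grows roughly linearly once the ridge estimate concentrates around `theta_star`, which is the behavior the regret analyses in these papers quantify.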

Provable benefit of multitask representation learning in reinforcement learning

Y Cheng, S Feng, J Yang, H Zhang… - Advances in Neural …, 2022 - proceedings.neurips.cc
While representation learning has become a powerful technique for reducing sample complexity in
reinforcement learning (RL) in practice, theoretical understanding of its advantage is still …

Multi-task representation learning with stochastic linear bandits

L Cella, K Lounici, G Pacreau… - … Conference on Artificial …, 2023 - proceedings.mlr.press
We study the problem of transfer-learning in the setting of stochastic linear contextual bandit
tasks. We consider that a low dimensional linear representation is shared across the tasks …
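The shared low-dimensional representation setting in this snippet can be illustrated with a quick simulation: task parameters all lie in the span of a common matrix, and stacking per-task ridge estimates and taking top singular vectors recovers that subspace. This is a generic method-of-moments-style sketch with made-up dimensions, not the estimator proposed in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, n_tasks, n = 20, 3, 30, 100          # ambient dim, rank, tasks, samples/task
B = np.linalg.qr(rng.normal(size=(d, r)))[0]   # shared d x r representation
thetas = B @ rng.normal(size=(r, n_tasks))     # every task parameter lies in span(B)

# Per-task ridge estimates from (contexts, rewards) pairs.
estimates = []
for t in range(n_tasks):
    X = rng.normal(size=(n, d))
    y = X @ thetas[:, t] + 0.1 * rng.normal(size=n)
    estimates.append(np.linalg.solve(X.T @ X + np.eye(d), X.T @ y))

# Top-r left singular vectors of the stacked estimates span the recovered subspace.
U = np.linalg.svd(np.column_stack(estimates))[0][:, :r]

# Subspace recovery error: projection of B onto the orthogonal complement of U.
err = np.linalg.norm((np.eye(d) - U @ U.T) @ B)
```

When the per-task estimates are accurate enough, `err` is small, and downstream tasks can learn in the recovered r-dimensional space instead of the d-dimensional ambient space, which is the source of the sample-complexity gains these transfer results formalize.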

Anytime model selection in linear bandits

P Kassraie, N Emmenegger… - Advances in Neural …, 2023 - proceedings.neurips.cc
Model selection in the context of bandit optimization is a challenging problem, as it
requires balancing exploration and exploitation not only for action selection, but also for …

Meta-learning hypothesis spaces for sequential decision-making

P Kassraie, J Rothfuss… - … Conference on Machine …, 2022 - proceedings.mlr.press
Obtaining reliable, adaptive confidence sets for prediction functions (hypotheses) is a central
challenge in sequential decision-making tasks, such as bandits and model-based …

Lifelong bandit optimization: no prior and no regret

F Schur, P Kassraie, J Rothfuss… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
Machine learning algorithms are often repeatedly applied to problems with similar
structure over and over again. We focus on solving a sequence of bandit optimization tasks …

Transportability for bandits with data from different environments

A Bellot, A Malek, S Chiappa - Advances in Neural …, 2023 - proceedings.neurips.cc
A unifying theme in the design of intelligent agents is to efficiently optimize a policy based on
what prior knowledge of the problem is available and what actions can be taken to learn …

Meta-learning in bandits within shared affine subspaces

S Bilaj, S Dhouib, S Maghsudi - International Conference on …, 2024 - proceedings.mlr.press
We study the problem of meta-learning several contextual stochastic bandits tasks by
leveraging their concentration around a low dimensional affine subspace, which we learn …

Meta representation learning with contextual linear bandits

L Cella, K Lounici, M Pontil - arXiv preprint arXiv:2205.15100, 2022 - arxiv.org
Meta-learning seeks to build algorithms that rapidly learn how to solve new learning
problems based on previous experience. In this paper we investigate meta-learning in the …