Scalable PAC-Bayesian meta-learning via the PAC-optimal hyper-posterior: from theory to practice
Meta-Learning aims to speed up the learning process on new tasks by acquiring useful
inductive biases from datasets of related learning tasks. While, in practice, the number of …
Online clustering of bandits with misspecified user models
The contextual linear bandit is an important online learning problem where given arm
features, a learning agent selects an arm at each round to maximize the cumulative rewards …
Provable benefit of multitask representation learning in reinforcement learning
As representation learning becomes a powerful technique to reduce sample complexity in
reinforcement learning (RL) in practice, theoretical understanding of its advantage is still …
Multi-task representation learning with stochastic linear bandits
We study the problem of transfer-learning in the setting of stochastic linear contextual bandit
tasks. We consider that a low dimensional linear representation is shared across the tasks …
Anytime model selection in linear bandits
Model selection in the context of bandit optimization is a challenging problem, as it
requires balancing exploration and exploitation not only for action selection, but also for …
Meta-learning hypothesis spaces for sequential decision-making
Obtaining reliable, adaptive confidence sets for prediction functions (hypotheses) is a central
challenge in sequential decision-making tasks, such as bandits and model-based …
Lifelong bandit optimization: no prior and no regret
Machine learning algorithms are often repeatedly applied to problems with similar
structure over and over again. We focus on solving a sequence of bandit optimization tasks …
Transportability for bandits with data from different environments
A unifying theme in the design of intelligent agents is to efficiently optimize a policy based on
what prior knowledge of the problem is available and what actions can be taken to learn …
Meta learning in bandits within shared affine subspaces
We study the problem of meta-learning several contextual stochastic bandits tasks by
leveraging their concentration around a low dimensional affine subspace, which we learn …
Meta representation learning with contextual linear bandits
Meta-learning seeks to build algorithms that rapidly learn how to solve new learning
problems based on previous experience. In this paper we investigate meta-learning in the …