- Academic Search

Online clustering of bandits with misspecified user models

Z Wang, J Xie, X Liu, S Li, J Lui - Advances in Neural …, 2023 - proceedings.neurips.cc

The contextual linear bandit is an important online learning problem where given arm
features, a learning agent selects an arm at each round to maximize the cumulative rewards …

Provable benefit of multitask representation learning in reinforcement learning

Y Cheng, S Feng, J Yang, H Zhang… - Advances in Neural …, 2022 - proceedings.neurips.cc

As representation learning becomes a powerful technique to reduce sample complexity in
reinforcement learning (RL) in practice, theoretical understanding of its advantage is still …

Multi-task representation learning with stochastic linear bandits

L Cella, K Lounici, G Pacreau… - … Conference on Artificial …, 2023 - proceedings.mlr.press

We study the problem of transfer-learning in the setting of stochastic linear contextual bandit
tasks. We consider that a low dimensional linear representation is shared across the tasks …

Anytime model selection in linear bandits

P Kassraie, N Emmenegger… - Advances in Neural …, 2023 - proceedings.neurips.cc

Model selection in the context of bandit optimization is a challenging problem, as it
requires balancing exploration and exploitation not only for action selection, but also for …

Meta-learning hypothesis spaces for sequential decision-making

P Kassraie, J Rothfuss… - … Conference on Machine …, 2022 - proceedings.mlr.press

Obtaining reliable, adaptive confidence sets for prediction functions (hypotheses) is a central
challenge in sequential decision-making tasks, such as bandits and model-based …

Lifelong bandit optimization: no prior and no regret

F Schur, P Kassraie, J Rothfuss… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press

Machine learning algorithms are often repeatedly applied to problems with similar
structure over and over again. We focus on solving a sequence of bandit optimization tasks …

Transportability for bandits with data from different environments

A Bellot, A Malek, S Chiappa - Advances in Neural …, 2023 - proceedings.neurips.cc

A unifying theme in the design of intelligent agents is to efficiently optimize a policy based on
what prior knowledge of the problem is available and what actions can be taken to learn …

Meta Learning in Bandits within shared affine Subspaces

S Bilaj, S Dhouib, S Maghsudi - International Conference on …, 2024 - proceedings.mlr.press

We study the problem of meta-learning several contextual stochastic bandits tasks by
leveraging their concentration around a low dimensional affine subspace, which we learn …