Efficient frameworks for generalized low-rank matrix bandit problems

Y Kang, CJ Hsieh, TCM Lee - Advances in Neural …, 2022 - proceedings.neurips.cc
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action
is given by the inner product between the action's feature matrix and some fixed, but initially …

Near-optimal representation learning for linear bandits and linear RL

J Hu, X Chen, C Jin, L Li… - … Conference on Machine …, 2021 - proceedings.mlr.press
This paper studies representation learning for multi-task linear bandits and multi-task
episodic RL with linear value function approximation. We first consider the setting where we …

Spectral entry-wise matrix estimation for low-rank reinforcement learning

S Stojanovic, Y Jedra… - Advances in Neural …, 2023 - proceedings.neurips.cc
We study matrix estimation problems arising in reinforcement learning with low-rank
structure. In low-rank bandits, the matrix to be recovered specifies the expected arm …

Multi-task representation learning for pure exploration in bilinear bandits

S Mukherjee, Q Xie, J Hanna… - Advances in Neural …, 2024 - proceedings.neurips.cc
We study multi-task representation learning for the problem of pure exploration in bilinear
bandits. In bilinear bandits, an action takes the form of a pair of arms from two different entity …

Impact of representation learning in linear bandits

J Yang, W Hu, JD Lee, SS Du - arXiv preprint arXiv:2010.06531, 2020 - arxiv.org
We study how representation learning can improve the efficiency of bandit problems. We
study the setting where we play $T$ linear bandits with dimension $d$ concurrently, and …

Low-rank generalized linear bandit problems

Y Lu, A Meisami, A Tewari - International Conference on …, 2021 - proceedings.mlr.press
In a low-rank linear bandit problem, the reward of an action (represented by a matrix of size
$d_1 \times d_2$) is the inner product between the action and an unknown low-rank matrix …

A simple unified framework for high dimensional bandit problems

W Li, A Barik, J Honorio - International Conference on …, 2022 - proceedings.mlr.press
Stochastic high dimensional bandit problems with low dimensional structures are useful in
different applications such as online advertising and drug discovery. In this work, we …

Doubly high-dimensional contextual bandits: An interpretable model for joint assortment-pricing

J Cai, R Chen, MJ Wainwright, L Zhao - arXiv preprint arXiv:2309.08634, 2023 - arxiv.org
Key challenges in running a retail business include how to select products to present to
consumers (the assortment problem), and how to price products (the pricing problem) to …

Nearly minimax algorithms for linear bandits with shared representation

J Yang, Q Lei, JD Lee, SS Du - arXiv preprint arXiv:2203.15664, 2022 - arxiv.org
We give novel algorithms for multi-task and lifelong linear bandits with shared
representation. Specifically, we consider the setting where we play $M$ linear bandits with …

Optimal algorithms for latent bandits with cluster structure

S Pal, AS Suggala, K Shanmugam… - … Conference on Artificial …, 2023 - proceedings.mlr.press
We consider the problem of latent bandits with cluster structure where there are multiple
users, each with an associated multi-armed bandit problem. These users are grouped into …