Efficient frameworks for generalized low-rank matrix bandit problems
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action
is given by the inner product between the action's feature matrix and some fixed, but initially …
Near-optimal representation learning for linear bandits and linear RL
This paper studies representation learning for multi-task linear bandits and multi-task
episodic RL with linear value function approximation. We first consider the setting where we …
Spectral entry-wise matrix estimation for low-rank reinforcement learning
We study matrix estimation problems arising in reinforcement learning with low-rank
structure. In low-rank bandits, the matrix to be recovered specifies the expected arm …
Multi-task representation learning for pure exploration in bilinear bandits
We study multi-task representation learning for the problem of pure exploration in bilinear
bandits. In bilinear bandits, an action takes the form of a pair of arms from two different entity …
Impact of representation learning in linear bandits
We study how representation learning can improve the efficiency of bandit problems. We
study the setting where we play $ T $ linear bandits with dimension $ d $ concurrently, and …
Low-rank generalized linear bandit problems
In a low-rank linear bandit problem, the reward of an action (represented by a matrix of size
$ d_1\times d_2 $) is the inner product between the action and an unknown low-rank matrix …
A simple unified framework for high dimensional bandit problems
Stochastic high dimensional bandit problems with low dimensional structures are useful in
different applications such as online advertising and drug discovery. In this work, we …
Doubly high-dimensional contextual bandits: An interpretable model for joint assortment-pricing
Key challenges in running a retail business include how to select products to present to
consumers (the assortment problem), and how to price products (the pricing problem) to …
Nearly minimax algorithms for linear bandits with shared representation
We give novel algorithms for multi-task and lifelong linear bandits with shared
representation. Specifically, we consider the setting where we play $ M $ linear bandits with …
Optimal algorithms for latent bandits with cluster structure
We consider the problem of latent bandits with cluster structure where there are multiple
users, each with an associated multi-armed bandit problem. These users are grouped into …