Online decision making with high-dimensional covariates

H Bastani, M Bayati - Operations Research, 2020 - pubsonline.informs.org
Big data have enabled decision makers to tailor decisions at the individual level in a variety
of domains, such as personalized medicine and online advertising. Doing so involves …

Beyond UCB: Optimal and efficient contextual bandits with regression oracles

D Foster, A Rakhlin - International Conference on Machine …, 2020 - proceedings.mlr.press
A fundamental challenge in contextual bandits is to develop flexible, general-purpose
algorithms with computational requirements no worse than classical supervised learning …

Balanced linear contextual bandits

M Dimakopoulou, Z Zhou, S Athey… - Proceedings of the AAAI …, 2019 - ojs.aaai.org
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as
well as the exploration method used, particularly in the presence of rich heterogeneity or …

Estimation considerations in contextual bandits

M Dimakopoulou, Z Zhou, S Athey… - arXiv preprint arXiv …, 2017 - arxiv.org
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as
well as the exploration method used, particularly in the presence of rich heterogeneity or …

Offline multi-action policy learning: Generalization and optimization

Z Zhou, S Athey, S Wager - Operations Research, 2023 - pubsonline.informs.org
In many settings, a decision maker wishes to learn a rule, or policy, that maps from
observable characteristics of an individual to an action. Examples include selecting offers …

Contextual bandits with similarity information

A Slivkins - Proceedings of the 24th Annual Conference on …, 2011 - proceedings.mlr.press
In a multi-armed bandit (MAB) problem, an online algorithm makes a sequence of choices.
In each round it chooses from a time-invariant set of alternatives and receives the payoff …

From ads to interventions: Contextual bandits in mobile health

A Tewari, SA Murphy - Mobile health: sensors, analytic methods, and …, 2017 - Springer
The first paper on contextual bandits was written by Michael Woodroofe in 1979 (Journal of
the American Statistical Association, 74 (368), 799–806, 1979) but the term “contextual …

Instance-dependent complexity of contextual bandits and reinforcement learning: A disagreement-based perspective

DJ Foster, A Rakhlin, D Simchi-Levi, Y Xu - arXiv preprint arXiv …, 2020 - arxiv.org
In the classical multi-armed bandit problem, instance-dependent algorithms attain improved
performance on "easy" problems with a gap between the best and second-best arm. Are …

Contextual bandits for adapting treatment in a mouse model of de novo carcinogenesis

A Durand, C Achilleos, D Iacovides… - Machine learning …, 2018 - proceedings.mlr.press
In this work, we present a specific case study where we aim to design effective treatment
allocation strategies and validate these using a mouse model of skin cancer. Collecting data …

Distributionally robust policy evaluation and learning in offline contextual bandits

N Si, F Zhang, Z Zhou… - … Conference on Machine …, 2020 - proceedings.mlr.press
Policy learning using historical observational data is an important problem that has found
widespread applications. However, existing literature rests on the crucial assumption that …