- Academic Search

Online decision making with high-dimensional covariates

H Bastani, M Bayati - Operations Research, 2020 - pubsonline.informs.org

Big data have enabled decision makers to tailor decisions at the individual level in a variety
of domains, such as personalized medicine and online advertising. Doing so involves …

Cited by 623

[PDF] mlr.press

Beyond UCB: Optimal and efficient contextual bandits with regression oracles

D Foster, A Rakhlin - International Conference on Machine …, 2020 - proceedings.mlr.press

A fundamental challenge in contextual bandits is to develop flexible, general-purpose
algorithms with computational requirements no worse than classical supervised learning …

Cited by 236

[PDF] aaai.org

Balanced linear contextual bandits

M Dimakopoulou, Z Zhou, S Athey… - Proceedings of the AAAI …, 2019 - ojs.aaai.org

Contextual bandit algorithms are sensitive to the estimation method of the outcome model as
well as the exploration method used, particularly in the presence of rich heterogeneity or …

Cited by 218

[PDF] arxiv.org

Estimation considerations in contextual bandits

M Dimakopoulou, Z Zhou, S Athey… - arXiv preprint arXiv …, 2017 - arxiv.org

Contextual bandit algorithms are sensitive to the estimation method of the outcome model as
well as the exploration method used, particularly in the presence of rich heterogeneity or …

Cited by 243

[PDF] arxiv.org

Offline multi-action policy learning: Generalization and optimization

Z Zhou, S Athey, S Wager - Operations Research, 2023 - pubsonline.informs.org

In many settings, a decision maker wishes to learn a rule, or policy, that maps from
observable characteristics of an individual to an action. Examples include selecting offers …

Cited by 205

[PDF] mlr.press

Contextual bandits with similarity information

A Slivkins - Proceedings of the 24th Annual Conference on …, 2011 - proceedings.mlr.press

In a multi-armed bandit (MAB) problem, an online algorithm makes a sequence of choices.
In each round it chooses from a time-invariant set of alternatives and receives the payoff …

Cited by 489

[PDF] ambujtewari.com

From ads to interventions: Contextual bandits in mobile health

A Tewari, SA Murphy - Mobile health: sensors, analytic methods, and …, 2017 - Springer

The first paper on contextual bandits was written by Michael Woodroofe in 1979 (Journal of
the American Statistical Association, 74 (368), 799–806, 1979) but the term “contextual …

Cited by 248

[PDF] arxiv.org

Instance-dependent complexity of contextual bandits and reinforcement learning: A disagreement-based perspective

DJ Foster, A Rakhlin, D Simchi-Levi, Y Xu - arXiv preprint arXiv …, 2020 - arxiv.org

In the classical multi-armed bandit problem, instance-dependent algorithms attain improved
performance on "easy" problems with a gap between the best and second-best arm. Are …

Cited by 101

[PDF] mlr.press

Contextual bandits for adapting treatment in a mouse model of de novo carcinogenesis

A Durand, C Achilleos, D Iacovides… - Machine learning …, 2018 - proceedings.mlr.press

In this work, we present a specific case study where we aim to design effective treatment
allocation strategies and validate these using a mouse model of skin cancer. Collecting data …

Cited by 135

[PDF] mlr.press

Distributionally robust policy evaluation and learning in offline contextual bandits

N Si, F Zhang, Z Zhou… - … Conference on Machine …, 2020 - proceedings.mlr.press

Policy learning using historical observational data is an important problem that has found
widespread applications. However, existing literature rests on the crucial assumption that …

Cited by 66
