Model selection in contextual stochastic bandit problems
A Pacchiano, M Phan… - Advances in …, 2020 - proceedings.neurips.cc
We study bandit model selection in stochastic environments. Our approach relies on a
master algorithm that selects between candidate base algorithms. We develop a master …
Learning personalized decision support policies
Individual human decision-makers may benefit from different forms of support to improve
decision outcomes, but when will each form of support yield better outcomes? In this work …
Tracking most significant shifts in nonparametric contextual bandits
We study nonparametric contextual bandits where Lipschitz mean reward functions may
change over time. We first establish the minimax dynamic regret rate in this less understood …
Dynamic contextual pricing with doubly non-parametric random utility models
In the evolving landscape of digital commerce, adaptive dynamic pricing strategies are
essential for gaining a competitive edge. This paper introduces novel doubly …
Unifying offline causal inference and online bandit learning for data driven decision
A fundamental question for companies with large amounts of logged data is: How to use such
logged data together with incoming streaming data to make good decisions? Many …
The role of contextual information in best arm identification
We study the best-arm identification problem with fixed confidence when contextual
(covariate) information is available in stochastic bandits. Although we can use contextual …
Adversarial rewards in universal learning for contextual bandits
We study the fundamental limits of learning in contextual bandits, where a learner's rewards
depend on their actions and a known context, which extends the canonical multi-armed …
Adaptive algorithm for multi-armed bandit problem with high-dimensional covariates
This article studies an important sequential decision making problem known as the multi-
armed stochastic bandit problem with covariates. Under a linear bandit framework with high …
Thompson sampling in partially observable contextual bandits
Contextual bandits constitute a classical framework for decision-making under uncertainty.
In this setting, the goal is to learn the arms of highest reward subject to contextual …
Self-tuning bandits over unknown covariate-shifts
Bandits with covariates, a.k.a. contextual bandits, address situations where optimal
actions (or arms) at a given time $t$ depend on a context $x_t$, e.g., a new …