- Academic Search

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited 1259 times · All 7 versions

[PDF] acm.org

Competitive caching with machine learned advice

T Lykouris, S Vassilvitskii - Journal of the ACM (JACM), 2021 - dl.acm.org

Traditional online algorithms encapsulate decision making under uncertainty, and give ways
to hedge against all possible future events, while guaranteeing a nearly optimal solution, as …

Cited 454 times · All 5 versions

[PDF] mlr.press

Dual mirror descent for online allocation problems

S Balseiro, H Lu, V Mirrokni - International Conference on …, 2020 - proceedings.mlr.press

We consider online allocation problems with concave revenue functions and resource
constraints, which are central problems in revenue management and online advertising. In …

Cited 157 times · All 10 versions

[PDF] neurips.cc

Corruption-robust offline reinforcement learning with general function approximation

C Ye, R Yang, Q Gu, T Zhang - Advances in Neural …, 2023 - proceedings.neurips.cc

We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …

Cited 19 times · All 7 versions

[PDF] neurips.cc

Adapting to misspecification in contextual bandits

DJ Foster, C Gentile, M Mohri… - Advances in Neural …, 2020 - proceedings.neurips.cc

A major research direction in contextual bandits is to develop algorithms that are
computationally efficient, yet support flexible, general-purpose function approximation …

Cited 116 times · All 9 versions

[PDF] mlr.press

Better algorithms for stochastic bandits with adversarial corruptions

A Gupta, T Koren, K Talwar - Conference on Learning …, 2019 - proceedings.mlr.press

We study the stochastic multi-armed bandits problem in the presence of adversarial
corruption. We present a new algorithm for this problem whose regret is nearly optimal …

Cited 189 times · All 5 versions

[PDF] neurips.cc

Nearly optimal algorithms for linear contextual bandits with adversarial corruptions

J He, D Zhou, T Zhang, Q Gu - Advances in neural …, 2022 - proceedings.neurips.cc

We study the linear contextual bandit problem in the presence of adversarial corruption,
where the reward at each round is corrupted by an adversary, and the corruption level (i.e. …

Cited 56 times · All 8 versions

[PDF] ssrn.com

Feature-based dynamic pricing

MC Cohen, I Lobel, R Paes Leme - Management Science, 2020 - pubsonline.informs.org

We consider the problem faced by a firm that receives highly differentiated products in an
online fashion. The firm needs to price these products to sell them to its customer base …

Cited 248 times · All 17 versions

[PDF] mlr.press

Corruption-robust algorithms with uncertainty weighting for nonlinear contextual bandits and Markov decision processes

C Ye, W Xiong, Q Gu, T Zhang - International Conference on …, 2023 - proceedings.mlr.press

Despite the significant interest and progress in reinforcement learning (RL) problems with
adversarial corruption, current works are either confined to the linear setting or lead to an …

Cited 30 times · All 7 versions

[PDF] mlr.press

Corruption-robust exploration in episodic reinforcement learning

T Lykouris, M Simchowitz… - … on Learning Theory, 2021 - proceedings.mlr.press

We initiate the study of episodic reinforcement learning under adversarial corruptions in both
the rewards and the transition probabilities of the underlying system extending recent results …

Cited 132 times · All 5 versions

Stochastic bandits robust to adversarial corruptions

Introduction to multi-armed bandits
