[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3323

[PDF] nowpublishers.com

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1259

[PDF] mlr.press

Contextual decision processes with low Bellman rank are PAC-learnable

N Jiang, A Krishnamurthy, A Agarwal… - International …, 2017 - proceedings.mlr.press

This paper studies systematic exploration for reinforcement learning (RL) with rich
observations and function approximation. We introduce contextual decision processes …

Cited by 514

[PDF] mlr.press

Beyond UCB: Optimal and efficient contextual bandits with regression oracles

D Foster, A Rakhlin - International conference on machine …, 2020 - proceedings.mlr.press

A fundamental challenge in contextual bandits is to develop flexible, general-purpose
algorithms with computational requirements no worse than classical supervised learning …

Cited by 240

[BOOK][B] Mathematical analysis of machine learning algorithms

T Zhang - 2023 - books.google.com

The mathematical theory of machine learning not only explains the current algorithms but
can also motivate principled approaches for the future. This self-contained textbook …

Cited by 65

[PDF] nowpublishers.com

Regret analysis of stochastic and nonstochastic multi-armed bandit problems

S Bubeck, N Cesa-Bianchi - Foundations and Trends® in …, 2012 - nowpublishers.com

Multi-armed bandit problems are the most basic examples of sequential decision problems
with an exploration-exploitation trade-off. This is the balance between staying with the option …

Cited by 3293

[PDF] mlr.press

Taming the monster: A fast and simple algorithm for contextual bandits

A Agarwal, D Hsu, S Kale, J Langford… - International …, 2014 - proceedings.mlr.press

We present a new algorithm for the contextual bandit learning problem, where the learner
repeatedly takes one of K actions in response to the observed context, and …

Cited by 608

[PDF] acm.org

Bandits with knapsacks

A Badanidiyuru, R Kleinberg, A Slivkins - Journal of the ACM (JACM), 2018 - dl.acm.org

Multi-armed bandit problems are the predominant theoretical model of exploration-
exploitation tradeoffs in learning, and they have countless applications ranging from medical …

Cited by 537

[PDF] ox.ac.uk

Adaptive treatment assignment in experiments for policy choice

M Kasy, A Sautmann - Econometrica, 2021 - Wiley Online Library

Standard experimental designs are geared toward point estimation and hypothesis testing,
while bandit algorithms are geared toward in‐sample outcomes. Here, we instead consider …

Cited by 182

[PDF] jmlr.org

[PDF] Parallelizing exploration-exploitation tradeoffs in Gaussian process bandit optimization

T Desautels, A Krause, JW Burdick - J. Mach. Learn. Res., 2014 - jmlr.org

How can we take advantage of opportunities for experimental parallelization in
exploration-exploitation tradeoffs? In many experimental scenarios, it is often desirable to …

Cited by 495

