[BOOK][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
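The entries above describe the stochastic multi-armed bandit setting: repeatedly choose among arms with unknown reward distributions, trading off exploration against exploitation. As a minimal, hypothetical illustration of that framework (not an algorithm from any of the cited works), an epsilon-greedy learner on Bernoulli arms might look like:

```python
import random

def epsilon_greedy(true_means, horizon, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy on a stochastic Bernoulli bandit.

    `true_means`, `horizon`, and `epsilon` are made-up parameters for
    this sketch, not values from the surveyed papers.
    """
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n          # pulls per arm
    estimates = [0.0] * n     # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: pick a uniformly random arm
        else:
            # exploit: pick the arm with the highest empirical mean
            arm = max(range(n), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the running mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return total_reward, estimates
```

The books by Lattimore & Szepesvári and Slivkins analyze far stronger strategies (UCB, Thompson sampling) with regret guarantees; this sketch only shows the explore/exploit loop they all share.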
A survey of learning in multiagent environments: Dealing with non-stationarity
The key challenge in multiagent learning is learning a best response to the behaviour of
other agents, which may be non-stationary: if the other agents adapt their strategy as well …
other agents, which may be non-stationary: if the other agents adapt their strategy as well …
Bandits with knapsacks
Multi-armed bandit problems are the predominant theoretical model of exploration-
exploitation tradeoffs in learning, and they have countless applications ranging from medical …
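In the Bandits-with-Knapsacks setting described above, each pull also consumes limited resources, and play stops when the budget runs out. A toy sketch of that loop, under assumed known per-arm costs and a plain epsilon-greedy choice rule (the cited papers use more refined algorithms, e.g. confidence bounds and LP relaxations), could be:

```python
import random

def bandit_with_budget(true_means, costs, budget, epsilon=0.1, seed=0):
    """Toy bandit-with-knapsack loop (illustrative only).

    Each pull of arm `a` yields a Bernoulli(true_means[a]) reward and
    consumes the known cost `costs[a]`; play stops once the remaining
    budget cannot cover any arm. All parameters are hypothetical.
    """
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n
    estimates = [0.0] * n
    spent, total_reward = 0.0, 0.0
    while spent + min(costs) <= budget:
        # only arms we can still afford are eligible
        affordable = [a for a in range(n) if spent + costs[a] <= budget]
        if rng.random() < epsilon:
            arm = rng.choice(affordable)  # explore
        else:
            # exploit: best estimated reward per unit of budget
            arm = max(affordable, key=lambda a: estimates[a] / costs[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        spent += costs[arm]
        total_reward += reward
    return total_reward, spent
```

Greedy reward-per-cost is a natural heuristic here, but the BwK literature shows it can be far from optimal; the point of the sketch is only the budget-limited stopping rule that distinguishes this model from the unconstrained bandit.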
Truthful incentives in crowdsourcing tasks using regret minimization mechanisms
What price should be offered to a worker for a task in an online labor market? How can one
enable workers to express the amount they desire to receive for the task completion …
Bandits with concave rewards and convex knapsacks
In this paper, we consider a very general model for exploration-exploitation tradeoff which
allows arbitrary concave rewards and convex constraints on the decisions across time, in …
Adversarial bandits with knapsacks
We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed
bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a …
[HTML] Efficient crowdsourcing of unknown experts using bounded multi-armed bandits
Increasingly, organisations flexibly outsource work on a temporary basis to a global
audience of workers. This so-called crowdsourcing has been applied successfully to a range …
Linear contextual bandits with knapsacks
We consider the linear contextual bandit problem with resource consumption, in addition to
reward generation. In each round, the outcome of pulling an arm is a reward as well as a …
Budget-constrained multi-armed bandits with multiple plays
We study the multi-armed bandit problem with multiple plays and a budget constraint for
both the stochastic and the adversarial setting. At each round, exactly K out of N possible …