Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
Dual mirror descent for online allocation problems
We consider online allocation problems with concave revenue functions and resource
constraints, which are central problems in revenue management and online advertising. In …
[BOOK][B] Linear and nonlinear programming
DG Luenberger, Y Ye - 1984 - Springer
This book is intended as a text covering the central concepts of practical optimization
techniques. It is designed for either self-study by professionals or classroom work at the …
Bandits with knapsacks
Multi-armed bandit problems are the predominant theoretical model of exploration-
exploitation tradeoffs in learning, and they have countless applications ranging from medical …
Online matching and ad allocation
A Mehta - … and Trends® in Theoretical Computer Science, 2013 - nowpublishers.com
Matching is a classic problem with a rich history and a significant impact, both on the theory
of algorithms and in practice. Recently there has been a surge of interest in the online …
[BOOK][B] Revenue management and pricing analytics
G Gallego, H Topaloglu - 2019 - Springer
Revenue management can be defined as a data-driven, computerized system to support the
tactical pricing of perishable assets at the micro-market level to maximize expected …
Online task assignment in crowdsourcing markets
We explore the problem of assigning heterogeneous tasks to workers with different,
unknown skill sets in crowdsourcing markets such as Amazon Mechanical Turk. We first …
Real-time optimization of personalized assortments
Motivated by the availability of real-time data on customer characteristics, we consider the
problem of personalizing the assortment of products for each arriving customer. Using actual …
Bandits with concave rewards and convex knapsacks
In this paper, we consider a very general model for exploration-exploitation tradeoff which
allows arbitrary concave rewards and convex constraints on the decisions across time, in …
Adversarial bandits with knapsacks
We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed
bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a …