Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

[BOOK][B] Partially observed Markov decision processes

V Krishnamurthy - 2016 - books.google.com
Covering formulation, algorithms, and structural results, and linking theory to real-world
applications in controlled sensing (including social learning, adaptive radars and sequential …

Correlated equilibrium as an expression of Bayesian rationality

RJ Aumann - Econometrica: Journal of the Econometric Society, 1987 - JSTOR
Correlated equilibrium is formulated in a manner that does away with the dichotomy usually
perceived between the "Bayesian" and the "game-theoretic" view of the world. From the …

Learning zero-sum simultaneous-move Markov games using function approximation and correlated equilibrium

Q Xie, Y Chen, Z Wang, Z Yang - Conference on learning …, 2020 - proceedings.mlr.press
In this work, we develop provably efficient reinforcement learning algorithms for two-player
zero-sum Markov games with simultaneous moves. We consider a family of Markov games …

Provably efficient reinforcement learning in decentralized general-sum Markov games

W Mao, T Başar - Dynamic Games and Applications, 2023 - Springer
This paper addresses the problem of learning an equilibrium efficiently in general-sum
Markov games through decentralized multi-agent reinforcement learning. Given the …

[BOOK][B] Stability and perfection of Nash equilibria

E Van Damme - 1991 - Springer
The last decade has seen a steady increase in the application of concepts from
noncooperative game theory to such diverse fields as economics, political science, law …

Intrinsic robustness of the price of anarchy

T Roughgarden - Journal of the ACM (JACM), 2015 - dl.acm.org
The price of anarchy, defined as the ratio of the worst-case objective function value of a
Nash equilibrium of a game and that of an optimal outcome, quantifies the inefficiency of …

Game theory models for communication between agents: a review

AD Farooqui, MA Niazi - Complex Adaptive Systems Modeling, 2016 - Springer
In the real world, agents or entities are in a continuous state of interactions. These
interactions lead to various types of complexity dynamics. One key difficulty in the study of …

Tight last-iterate convergence rates for no-regret learning in multi-player games

N Golowich, S Pattathil… - Advances in neural …, 2020 - proceedings.neurips.cc
We study the question of obtaining last-iterate convergence rates for no-regret learning
algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a …

Stackelberg vs. Nash in security games: An extended investigation of interchangeability, equivalence, and uniqueness

D Korzhyk, Z Yin, C Kiekintveld, V Conitzer… - Journal of Artificial …, 2011 - jair.org
There has been significant recent interest in game-theoretic approaches to security, with
much of the recent research focused on utilizing the leader-follower Stackelberg game …