When can we learn general-sum Markov games with a large number of players sample-efficiently?

Z Song, S Mei, Y Bai - arXiv preprint arXiv:2110.04184, 2021 - arxiv.org
Multi-agent reinforcement learning has made substantial empirical progress in solving
games with a large number of players. However, theoretically, the best known sample …

Breaking the curse of multiagency: Provably efficient decentralized multi-agent RL with function approximation

Y Wang, Q Liu, Y Bai, C Jin - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of
multiagency, where the description length of the game as well as the complexity of many …

Online learning in Stackelberg games with an omniscient follower

G Zhao, B Zhu, J Jiao, M Jordan - … Conference on Machine …, 2023 - proceedings.mlr.press
We study the problem of online learning in a two-player decentralized cooperative
Stackelberg game. In each round, the leader first takes an action, followed by the follower …

The sample complexity of online contract design

B Zhu, S Bates, Z Yang, Y Wang, J Jiao… - arXiv preprint arXiv …, 2022 - arxiv.org
We study the hidden-action principal-agent problem in an online setting. In each round, the
principal posts a contract that specifies the payment to the agent based on each outcome …

Towards general function approximation in zero-sum Markov games

B Huang, JD Lee, Z Wang, Z Yang - arXiv preprint arXiv:2107.14702, 2021 - arxiv.org
This paper considers two-player zero-sum finite-horizon Markov games with simultaneous
moves. The study focuses on the challenging settings where the value function or the model …

Welfare maximization in competitive equilibrium: Reinforcement learning for Markov exchange economy

Z Liu, M Lu, Z Wang, M Jordan… - … Conference on Machine …, 2022 - proceedings.mlr.press
We study a bilevel economic system, which we refer to as a Markov exchange economy
(MEE), from the point of view of multi-agent reinforcement learning (MARL). An MEE …

A survey of decision making in adversarial games

X Li, M Meng, Y Hong, J Chen - Science China Information Sciences, 2024 - Springer
In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …

PARL: A unified framework for policy alignment in reinforcement learning from human feedback

S Chakraborty, A Bedi, A Koppel, H Wang… - The Twelfth …, 2024 - openreview.net
We present a novel unified bilevel optimization-based framework, PARL, formulated
to address the recently highlighted critical issue of policy alignment in reinforcement …

Can reinforcement learning find Stackelberg-Nash equilibria in general-sum Markov games with myopic followers?

H Zhong, Z Yang, Z Wang, MI Jordan - arXiv preprint arXiv:2112.13521, 2021 - arxiv.org
We study multi-player general-sum Markov games with one of the players designated as the
leader and the other players regarded as followers. In particular, we focus on the class of …

Computing optimal equilibria and mechanisms via learning in zero-sum extensive-form games

B Zhang, G Farina, I Anagnostides… - Advances in …, 2024 - proceedings.neurips.cc
We introduce a new approach for computing optimal equilibria via learning in games. It
applies to extensive-form settings with any number of players, including mechanism design …