Breaking the curse of multiagency: Provably efficient decentralized multi-agent RL with function approximation

Y Wang, Q Liu, Y Bai, C Jin - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of
multiagency, where the description length of the game as well as the complexity of many …

Policy space diversity for non-transitive games

J Yao, W Liu, H Fu, Y Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …

Sample-efficient reinforcement learning of partially observable Markov games

Q Liu, C Szepesvári, C Jin - Advances in Neural …, 2022 - proceedings.neurips.cc
This paper considers the challenging tasks of Multi-Agent Reinforcement Learning (MARL)
under partial observability, where each agent only sees her own individual observations and …

A survey of decision making in adversarial games

X Li, M Meng, Y Hong, J Chen - Science China Information Sciences, 2024 - Springer
In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …

Efficient Phi-regret minimization in extensive-form games via online mirror descent

Y Bai, C Jin, S Mei, Z Song… - Advances in Neural …, 2022 - proceedings.neurips.cc
A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert
them to Normal-Form Games (NFGs). This approach enables us to directly translate state-of …

Partially observable RL with B-stability: Unified structural condition and sharp sample-efficient algorithms

F Chen, Y Bai, S Mei - arXiv preprint arXiv:2209.14990, 2022 - arxiv.org
Partial Observability--where agents can only observe partial information about the true
underlying state of the system--is ubiquitous in real-world applications of Reinforcement …

Near-Optimal Φ-Regret Learning in Extensive-Form Games

I Anagnostides, G Farina… - … Conference on Machine …, 2023 - proceedings.mlr.press
In this paper, we establish efficient and uncoupled learning dynamics so that, when
employed by all players in multiplayer perfect-recall imperfect-information extensive-form …

Sample-efficient learning of correlated equilibria in extensive-form games

Z Song, S Mei, Y Bai - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Imperfect-Information Extensive-Form Games (IIEFGs) is a prevalent model for real-world
games involving imperfect information and sequential plays. The Extensive-Form …

An efficient deep reinforcement learning algorithm for solving imperfect information extensive-form games

L Meng, Z Ge, P Tian, B An, Y Gao - … of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
One of the most popular methods for learning Nash equilibrium (NE) in large-scale imperfect
information extensive-form games (IIEFGs) is the neural variants of counterfactual regret …

Adapting to game trees in zero-sum imperfect information games

C Fiegel, P Ménard, T Kozuno… - International …, 2023 - proceedings.mlr.press
Imperfect information games (IIG) are games in which each player only partially observes
the current game state. We study how to learn $\epsilon$-optimal strategies in a zero-sum …