Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence

D Ding, CY Wei, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
We examine global non-asymptotic convergence properties of policy gradient methods for
multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To …

The complexity of Markov equilibrium in stochastic games

C Daskalakis, N Golowich… - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
We show that computing approximate stationary Markov coarse correlated equilibria (CCE)
in general-sum stochastic games is PPAD-hard, even when there are two players, the game …

On improving model-free algorithms for decentralized multi-agent reinforcement learning

W Mao, L Yang, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential
sample complexity dependence on the number of agents, a phenomenon known as the …

Provably fast convergence of independent natural policy gradient for Markov potential games

Y Sun, T Liu, R Zhou, PR Kumar… - Advances in Neural …, 2023 - proceedings.neurips.cc
This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent
reinforcement learning problem in Markov potential games. It is shown that, under mild …

Zero-sum polymatrix Markov games: Equilibrium collapse and efficient computation of Nash equilibria

F Kalogiannis, I Panageas - Advances in Neural …, 2024 - proceedings.neurips.cc
The works of (Daskalakis et al., 2009, 2022; Jin et al., 2022; Deng et al., 2023) indicate that
computing Nash equilibria in multi-player Markov games is a computationally hard task. This …

Gradient play in stochastic games: stationary points, convergence, and sample complexity

R Zhang, Z Ren, N Li - IEEE Transactions on Automatic Control, 2024 - ieeexplore.ieee.org
We study the performance of the gradient play algorithm for stochastic games (SGs), where
each agent tries to maximize its own total discounted reward by making decisions …

On the global convergence rates of decentralized softmax gradient play in Markov potential games

R Zhang, J Mei, B Dai… - Advances in Neural …, 2022 - proceedings.neurips.cc
Softmax policy gradient is a popular algorithm for policy optimization in single-agent
reinforcement learning, particularly since projection is not needed for each gradient update …

Learning in congestion games with bandit feedback

Q Cui, Z Xiong, M Fazel, SS Du - Advances in Neural …, 2022 - proceedings.neurips.cc
In this paper, we investigate Nash-regret minimization in congestion games, a class of
games with benign theoretical structure and broad real-world applications. We first propose …

ESCHER: Eschewing importance sampling in games by computing a history value function to estimate regret

S McAleer, G Farina, M Lanctot, T Sandholm - arXiv preprint arXiv …, 2022 - arxiv.org
Recent techniques for approximating Nash equilibria in very large games leverage neural
networks to learn approximately optimal policies (strategies). One promising line of research …

The complexity of infinite-horizon general-sum stochastic games

Y Jin, V Muthukumar, A Sidford - arXiv preprint arXiv:2204.04186, 2022 - arxiv.org
We study the complexity of computing stationary Nash equilibrium (NE) in n-player infinite-
horizon general-sum stochastic games. We focus on the problem of computing NE in such …