Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence
We examine global non-asymptotic convergence properties of policy gradient methods for
multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To …
The complexity of Markov equilibrium in stochastic games
We show that computing approximate stationary Markov coarse correlated equilibria (CCE)
in general-sum stochastic games is PPAD-hard, even when there are two players, the game …
On improving model-free algorithms for decentralized multi-agent reinforcement learning
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential
sample complexity dependence on the number of agents, a phenomenon known as the …
Provably fast convergence of independent natural policy gradient for Markov potential games
This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent
reinforcement learning problem in Markov potential games. It is shown that, under mild …
Zero-sum polymatrix Markov games: Equilibrium collapse and efficient computation of Nash equilibria
The works of (Daskalakis et al., 2009, 2022; Jin et al., 2022; Deng et al., 2023) indicate that
computing Nash equilibria in multi-player Markov games is a computationally hard task. This …
Gradient play in stochastic games: stationary points, convergence, and sample complexity
We study the performance of the gradient play algorithm for stochastic games (SGs), where
each agent tries to maximize its own total discounted reward by making decisions …
On the global convergence rates of decentralized softmax gradient play in Markov potential games
Softmax policy gradient is a popular algorithm for policy optimization in single-agent
reinforcement learning, particularly since projection is not needed for each gradient update …
Learning in congestion games with bandit feedback
In this paper, we investigate Nash-regret minimization in congestion games, a class of
games with benign theoretical structure and broad real-world applications. We first propose …
Escher: Eschewing importance sampling in games by computing a history value function to estimate regret
Recent techniques for approximating Nash equilibria in very large games leverage neural
networks to learn approximately optimal policies (strategies). One promising line of research …
The complexity of infinite-horizon general-sum stochastic games
We study the complexity of computing stationary Nash equilibrium (NE) in n-player infinite-
horizon general-sum stochastic games. We focus on the problem of computing NE in such …