Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Natural policy gradient primal-dual method for constrained Markov decision processes

D Ding, K Zhang, T Başar… - Advances in Neural …, 2020 - proceedings.neurips.cc
We study sequential decision-making problems in which each agent aims to maximize the
expected total reward while satisfying a constraint on the expected total utility. We employ …

Independent policy gradient methods for competitive reinforcement learning

C Daskalakis, DJ Foster… - Advances in neural …, 2020 - proceedings.neurips.cc
We obtain global, non-asymptotic convergence guarantees for independent learning
algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum …

Global convergence of policy gradient methods to (almost) locally optimal policies

K Zhang, A Koppel, H Zhu, T Başar - SIAM Journal on Control and …, 2020 - SIAM
Policy gradient (PG) methods have been one of the most essential ingredients of
reinforcement learning, with application in a variety of domains. In spite of the empirical …

Decentralized Q-learning in zero-sum Markov games

M Sayin, K Zhang, D Leslie, T Başar… - Advances in Neural …, 2021 - proceedings.neurips.cc
We study multi-agent reinforcement learning (MARL) in infinite-horizon discounted zero-sum
Markov games. We focus on the practical but challenging setting of decentralized MARL …

Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence

D Ding, CY Wei, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
We examine global non-asymptotic convergence properties of policy gradient methods for
multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To …

Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity

K Zhang, SM Kakade, T Başar, LF Yang - Journal of Machine Learning …, 2023 - jmlr.org
Model-based reinforcement learning (RL), which finds an optimal policy after establishing an
empirical model, has long been recognized as one of the cornerstones of RL. It is especially …

Do GANs always have Nash equilibria?

F Farnia, A Ozdaglar - International Conference on Machine …, 2020 - proceedings.mlr.press
Generative adversarial networks (GANs) represent a zero-sum game between two machine
players, a generator and a discriminator, designed to learn the distribution of data. While …