Google Scholar

Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Cited by 1715, 7 versions

[PDF] annualreviews.org

Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org

Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

Cited by 90, 8 versions

[PDF] arxiv.org

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org

Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cited by 352, 2 versions

[PDF] neurips.cc

Natural policy gradient primal-dual method for constrained Markov decision processes

D Ding, K Zhang, T Basar… - Advances in Neural …, 2020 - proceedings.neurips.cc

We study sequential decision-making problems in which each agent aims to maximize the
expected total reward while satisfying a constraint on the expected total utility. We employ …

Cited by 223, 9 versions

[PDF] neurips.cc

Independent policy gradient methods for competitive reinforcement learning

C Daskalakis, DJ Foster… - Advances in neural …, 2020 - proceedings.neurips.cc

We obtain global, non-asymptotic convergence guarantees for independent learning
algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum …

Cited by 207, 6 versions

[PDF] arxiv.org

Global convergence of policy gradient methods to (almost) locally optimal policies

K Zhang, A Koppel, H Zhu, T Basar - SIAM Journal on Control and …, 2020 - SIAM

Policy gradient (PG) methods have been one of the most essential ingredients of
reinforcement learning, with application in a variety of domains. In spite of the empirical …

Cited by 231, 10 versions

[PDF] neurips.cc

Decentralized Q-learning in zero-sum Markov games

M Sayin, K Zhang, D Leslie, T Basar… - Advances in Neural …, 2021 - proceedings.neurips.cc

We study multi-agent reinforcement learning (MARL) in infinite-horizon discounted zero-sum
Markov games. We focus on the practical but challenging setting of decentralized MARL …

Cited by 113, 8 versions

[PDF] mlr.press

Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence

D Ding, CY Wei, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press

We examine global non-asymptotic convergence properties of policy gradient methods for
multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To …

Cited by 87, 9 versions

[PDF] jmlr.org

Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity

K Zhang, SM Kakade, T Basar, LF Yang - Journal of Machine Learning …, 2023 - jmlr.org

Model-based reinforcement learning (RL), which finds an optimal policy after establishing an
empirical model, has long been recognized as one of the cornerstones of RL. It is especially …

Cited by 159, 15 versions

[PDF] mlr.press

Do GANs always have Nash equilibria?

F Farnia, A Ozdaglar - International Conference on Machine …, 2020 - proceedings.mlr.press

Generative adversarial networks (GANs) represent a zero-sum game between two machine
players, a generator and a discriminator, designed to learn the distribution of data. While …

Cited by 128, 4 versions

Saved in my library

Policy optimization provably converges to Nash equilibria in zero-sum linear quadratic games

Multi-agent reinforcement learning: A selective overview of theories and algorithms

Toward a theoretical foundation of policy optimization for learning control policies

An overview of multi-agent reinforcement learning from game theoretical perspective

Natural policy gradient primal-dual method for constrained Markov decision processes

Independent policy gradient methods for competitive reinforcement learning

Global convergence of policy gradient methods to (almost) locally optimal policies

Decentralized Q-learning in zero-sum Markov games

Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence

Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity

Do GANs always have Nash equilibria?