Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Is pessimism provably efficient for offline RL?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

A theoretical analysis of deep Q-learning

J Fan, Z Wang, Y Xie, Z Yang - Learning for dynamics and …, 2020 - proceedings.mlr.press
Despite the great empirical success of deep reinforcement learning, its theoretical
foundation is less well understood. In this work, we make the first attempt to theoretically …

Bridging offline reinforcement learning and imitation learning: A tale of pessimism

P Rashidinejad, B Zhu, C Ma, J Jiao… - Advances in Neural …, 2021 - proceedings.neurips.cc
Offline (or batch) reinforcement learning (RL) algorithms seek to learn an optimal policy from
a fixed dataset without active data collection. Based on the composition of the offline dataset …

Single and multi-agent deep reinforcement learning for AI-enabled wireless networks: A tutorial

A Feriani, E Hossain - IEEE Communications Surveys & …, 2021 - ieeexplore.ieee.org
Deep Reinforcement Learning (DRL) has recently witnessed significant advances that have
led to multiple successes in solving sequential decision-making problems in various …

On the theory of policy gradient methods: Optimality, approximation, and distribution shift

A Agarwal, SM Kakade, JD Lee, G Mahajan - Journal of Machine Learning …, 2021 - jmlr.org
Policy gradient methods are among the most effective methods in challenging reinforcement
learning problems with large state and/or action spaces. However, little is known about even …

A review of cooperative multi-agent deep reinforcement learning

A Oroojlooy, D Hajinezhad - Applied Intelligence, 2023 - Springer
Deep Reinforcement Learning has made significant progress in multi-agent
systems in recent years. The aim of this review article is to provide an overview of recent …

Optimality and approximation with policy gradient methods in Markov decision processes

A Agarwal, SM Kakade, JD Lee… - … on Learning Theory, 2020 - proceedings.mlr.press
Policy gradient (PG) methods are among the most effective methods in challenging
reinforcement learning problems with large state and/or action spaces. However, little is …

Natural policy gradient primal-dual method for constrained Markov decision processes

D Ding, K Zhang, T Başar… - Advances in Neural …, 2020 - proceedings.neurips.cc
We study sequential decision-making problems in which each agent aims to maximize the
expected total reward while satisfying a constraint on the expected total utility. We employ …