Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
Reinforcement learning for selective key applications in power systems: Recent advances and future challenges
With large-scale integration of renewable generation and distributed energy resources,
modern power systems are confronted with new operational challenges, such as growing …
[BOOK] Control systems and reinforcement learning
S Meyn - 2022 - books.google.com
A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …
Online robust reinforcement learning with model uncertainty
Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case
performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust …
Federated reinforcement learning: Linear speedup under Markovian sampling
Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling
observations from the environment is usually split across multiple agents. However …
CRPO: A new approach for safe reinforcement learning with convergence guarantee
In safe reinforcement learning (SRL) problems, an agent explores the environment to
maximize an expected total reward and meanwhile avoids violation of certain constraints on …
Finite-sample analysis for SARSA with linear function approximation
SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement
learning. We investigate the SARSA algorithm with linear function approximation under the …
A finite-time analysis of two time-scale actor-critic methods
Actor-critic (AC) methods have exhibited great empirical success compared with other
reinforcement learning algorithms, where the actor uses the policy gradient to improve the …
Breaking the sample size barrier in model-based reinforcement learning with a generative model
We investigate the sample efficiency of reinforcement learning in a $\gamma$-discounted
infinite-horizon Markov decision process (MDP) with state space S and action space A …
Is Q-learning minimax optimal? A tight sample complexity analysis
Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP)
in a model-free fashion, lies at the heart of reinforcement learning. When it comes to the …