Discovered policy optimisation
Tremendous progress has been made in reinforcement learning (RL) over the past decade.
Most of these advancements came through the continual development of new algorithms …
Linear convergence of natural policy gradient methods with log-linear policies
We consider infinite-horizon discounted Markov decision processes and study the
convergence rates of the natural policy gradient (NPG) and the Q-NPG methods with the log …
Meta-Black-Box optimization for evolutionary algorithms: Review and perspective
Black-Box Optimization (BBO) is increasingly vital for addressing complex real-world
optimization challenges, where traditional methods fall short due to their reliance on …
A novel framework for policy mirror descent with general parameterization and linear convergence
Modern policy optimization methods in reinforcement learning, such as TRPO and PPO, owe
their success to the use of parameterized policies. However, while theoretical guarantees …
Heterogeneous-agent reinforcement learning
The necessity for cooperation among intelligent machines has popularised cooperative multi-
agent reinforcement learning (MARL) in AI research. However, many research endeavours …
Proximal learning with opponent-learning awareness
Learning With Opponent-Learning Awareness (LOLA) (Foerster et al. [2018a]) is a
multi-agent reinforcement learning algorithm that typically learns reciprocity-based …
Discovering temporally-aware reinforcement learning algorithms
Recent advancements in meta-learning have enabled the automatic discovery of novel
reinforcement learning algorithms parameterized by surrogate objective functions. To …
Heterogeneous-agent mirror learning: A continuum of solutions to cooperative MARL
The necessity for cooperation among intelligent machines has popularised cooperative multi-
agent reinforcement learning (MARL) in the artificial intelligence (AI) research community …
Mutual-Information Regularized Multi-Agent Policy Iteration
Despite the success of cooperative multi-agent reinforcement learning algorithms, most of
them focus on a single team composition, which prevents them from being used in more …