Evolutionary reinforcement learning: A survey

H Bai, R Cheng, Y ** - Intelligent Computing, 2023 - spj.science.org
Reinforcement learning (RL) is a machine learning approach that trains agents to maximize
cumulative rewards through interactions with environments. The integration of RL with deep …

Derivative-free reinforcement learning: A review

H Qian, Y Yu - Frontiers of Computer Science, 2021 - Springer
Reinforcement learning is about learning agent models that make the best sequential
decisions in unknown environments. In an unknown environment, the agent needs to …

A theoretical and empirical comparison of gradient approximations in derivative-free optimization

AS Berahas, L Cao, K Choromanski… - Foundations of …, 2022 - Springer
In this paper, we analyze several methods for approximating gradients of noisy functions
using only function values. These methods include finite differences, linear interpolation …

Effective diversity in population based reinforcement learning

J Parker-Holder, A Pacchiano… - Advances in …, 2020 - proceedings.neurips.cc
Exploration is a key problem in reinforcement learning, since agents can only learn from
data they acquire in the environment. With that in mind, maintaining a population of agents is …

i-sim2real: Reinforcement learning of robotic policies in tight human-robot interaction loops

SW Abeyruwan, L Graesser… - … on Robot Learning, 2023 - proceedings.mlr.press
Sim-to-real transfer is a powerful paradigm for robotic reinforcement learning. The ability to
train policies in simulation enables safe exploration and large-scale data collection quickly …

Observational overfitting in reinforcement learning

X Song, Y Jiang, S Tu, Y Du, B Neyshabur - arXiv preprint arXiv …, 2019 - arxiv.org
A major component of overfitting in model-free reinforcement learning (RL) involves the case
where the agent may mistakenly correlate reward with certain spurious features from the …

Sample-efficient cross-entropy method for real-time planning

C Pinneri, S Sawant, S Blaes… - … on Robot Learning, 2021 - proceedings.mlr.press
Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy
Method (CEM), can yield compelling results even in high-dimensional control tasks and …

ES-MAML: Simple Hessian-free meta learning

X Song, W Gao, Y Yang, K Choromanski… - arXiv preprint arXiv …, 2019 - arxiv.org
We introduce ES-MAML, a new framework for solving the model agnostic meta learning
(MAML) problem based on Evolution Strategies (ES). Existing algorithms for MAML are …

Deep reinforcement learning versus evolution strategies: A comparative survey

AY Majid, S Saaybi, V Francois-Lavet… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deep reinforcement learning (DRL) and evolution strategies (ESs) have surpassed
human-level control in many sequential decision-making problems, yet many open challenges still …

Zeroth-order nonconvex stochastic optimization: Handling constraints, high dimensionality, and saddle points

K Balasubramanian, S Ghadimi - Foundations of Computational …, 2022 - Springer
In this paper, we propose and analyze zeroth-order stochastic approximation algorithms for
nonconvex and convex optimization, with a focus on addressing constrained optimization …