Google Scholar

Evolutionary reinforcement learning: A survey

H Bai, R Cheng, Y ** - Intelligent Computing, 2023 - spj.science.org

Reinforcement learning (RL) is a machine learning approach that trains agents to maximize
cumulative rewards through interactions with environments. The integration of RL with deep …

Cited by 57

Derivative-free reinforcement learning: A review

H Qian, Y Yu - Frontiers of Computer Science, 2021 - Springer

Reinforcement learning is about learning agent models that make the best sequential
decisions in unknown environments. In an unknown environment, the agent needs to …

Cited by 47

A theoretical and empirical comparison of gradient approximations in derivative-free optimization

AS Berahas, L Cao, K Choromanski… - Foundations of …, 2022 - Springer

In this paper, we analyze several methods for approximating gradients of noisy functions
using only function values. These methods include finite differences, linear interpolation …

Cited by 201

Effective diversity in population based reinforcement learning

J Parker-Holder, A Pacchiano… - Advances in …, 2020 - proceedings.neurips.cc

Exploration is a key problem in reinforcement learning, since agents can only learn from
data they acquire in the environment. With that in mind, maintaining a population of agents is …

Cited by 173

i-sim2real: Reinforcement learning of robotic policies in tight human-robot interaction loops

SW Abeyruwan, L Graesser… - … on Robot Learning, 2023 - proceedings.mlr.press

Sim-to-real transfer is a powerful paradigm for robotic reinforcement learning. The ability to
train policies in simulation enables safe exploration and large-scale data collection quickly …

Cited by 56

Observational overfitting in reinforcement learning

X Song, Y Jiang, S Tu, Y Du, B Neyshabur - arxiv preprint arxiv …, 2019 - arxiv.org

A major component of overfitting in model-free reinforcement learning (RL) involves the case
where the agent may mistakenly correlate reward with certain spurious features from the …

Cited by 159

Sample-efficient cross-entropy method for real-time planning

C Pinneri, S Sawant, S Blaes… - … on Robot Learning, 2021 - proceedings.mlr.press

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy
Method (CEM), can yield compelling results even in high-dimensional control tasks and …

Cited by 119

Es-maml: Simple hessian-free meta learning

X Song, W Gao, Y Yang, K Choromanski… - arxiv preprint arxiv …, 2019 - arxiv.org

We introduce ES-MAML, a new framework for solving the model agnostic meta learning
(MAML) problem based on Evolution Strategies (ES). Existing algorithms for MAML are …

Cited by 145

Deep reinforcement learning versus evolution strategies: A comparative survey

AY Majid, S Saaybi, V Francois-Lavet… - IEEE transactions on …, 2023 - ieeexplore.ieee.org

Deep reinforcement learning (DRL) and evolution strategies (ESs) have surpassed human-
level control in many sequential decision-making problems, yet many open challenges still …

Cited by 73

Zeroth-order nonconvex stochastic optimization: Handling constraints, high dimensionality, and saddle points

K Balasubramanian, S Ghadimi - Foundations of Computational …, 2022 - Springer

In this paper, we propose and analyze zeroth-order stochastic approximation algorithms for
nonconvex and convex optimization, with a focus on addressing constrained optimization …

Cited by 121

Structured evolution with compact architectures for scalable policy optimization

Evolutionary reinforcement learning: A survey