Google Scholar

Fusion dynamical systems with machine learning in imitation learning: A comprehensive overview

Y Hu, FJ Abu-Dakka, F Chen, X Luo, Z Li, A Knoll… - Information …, 2024 - Elsevier

Imitation Learning (IL), also referred to as Learning from Demonstration (LfD), holds
significant promise for capturing expert motor skills through efficient imitation, facilitating …

Cited by: 4

One pixel attack for fooling deep neural networks

J Su, DV Vargas, K Sakurai - IEEE Transactions on …, 2019 - ieeexplore.ieee.org

Recent research has revealed that the output of deep neural networks (DNNs) can be easily
altered by adding relatively small perturbations to the input vector. In this paper, we analyze …

Cited by: 3073

Maximum a posteriori policy optimisation

A Abdolmaleki, JT Springenberg, Y Tassa… - arXiv preprint arXiv …, 2018 - arxiv.org

We introduce a new algorithm for reinforcement learning called Maximum a posteriori Policy
Optimisation (MPO) based on coordinate ascent on a relative entropy objective. We show …

Cited by: 557

V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control

HF Song, A Abdolmaleki, JT Springenberg… - arXiv preprint arXiv …, 2019 - arxiv.org

Some of the most successful applications of deep reinforcement learning to challenging
domains in discrete and continuous control have used policy gradient methods in the on …

Cited by: 133

Evolution strategies for continuous optimization: A survey of the state-of-the-art

Z Li, X Lin, Q Zhang, H Liu - Swarm and Evolutionary Computation, 2020 - Elsevier

Evolution strategies are a class of evolutionary algorithms for black-box optimization and
achieve state-of-the-art performance on many benchmarks and real-world applications …

Cited by: 80

Variational inference MPC for Bayesian model-based reinforcement learning

M Okada, T Taniguchi - Conference on robot learning, 2020 - proceedings.mlr.press

In recent studies on model-based reinforcement learning (MBRL), incorporating uncertainty
in forward dynamics is a state-of-the-art strategy to enhance learning performance, making …

Cited by: 80

PPO-CMA: Proximal policy optimization with covariance matrix adaptation

P Hämäläinen, A Babadi, X Ma… - 2020 IEEE 30th …, 2020 - ieeexplore.ieee.org

Proximal Policy Optimization (PPO) is a highly popular model-free reinforcement learning
(RL) approach. However, we observe that in a continuous action space, PPO can …

Cited by: 84

Relative entropy regularized policy iteration

A Abdolmaleki, JT Springenberg, J Degrave… - arXiv preprint arXiv …, 2018 - arxiv.org

We present an off-policy actor-critic algorithm for Reinforcement Learning (RL) that
combines ideas from gradient-free optimization via stochastic search with learned action …

Cited by: 79

Entropic risk measure in policy search

D Nass, B Belousov, J Peters - 2019 IEEE/RSJ International …, 2019 - ieeexplore.ieee.org

With the increasing pace of automation, modern robotic systems need to act in stochastic,
non-stationary, partially observable environments. A range of algorithms for finding …

Cited by: 46

High acceleration reinforcement learning for real-world juggling with binary rewards

K Ploeger, M Lutter, J Peters - Conference on Robot …, 2021 - proceedings.mlr.press

Robots that can learn in the physical world will be important to enable robots to escape their
stiff and pre-programmed movements. For dynamic high-acceleration tasks, such as …

Cited by: 33

Saved to your library

Deriving and improving CMA-ES with information geometric trust regions

Fusion dynamical systems with machine learning in imitation learning: A comprehensive overview
