- Academic Search

A survey on policy search algorithms for learning robot controllers in a handful of trials

K Chatzilygeroudis, V Vassiliades… - IEEE Transactions …, 2019 - ieeexplore.ieee.org

Most policy search (PS) algorithms require thousands of training episodes to find an
effective policy, which is often infeasible with a physical robot. This survey article focuses on …

Cited by 207 · All 17 versions

[PDF] neurips.cc

MOPO: Model-based offline policy optimization

T Yu, G Thomas, L Yu, S Ermon… - Advances in …, 2020 - proceedings.neurips.cc

Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a
batch of previously collected data. This problem setting is compelling, because it offers the …

Cited by 904 · All 11 versions

[PDF] arxiv.org

RvS: What is essential for offline RL via supervised learning?

S Emmons, B Eysenbach, I Kostrikov… - arXiv preprint arXiv …, 2021 - arxiv.org

Recent work has shown that supervised learning alone, without temporal difference (TD)
learning, can be remarkably effective for offline RL. When does this hold true, and which …

Cited by 209 · All 4 versions

[PDF] neurips.cc

When to trust your model: Model-based policy optimization

M Janner, J Fu, M Zhang… - Advances in neural …, 2019 - proceedings.neurips.cc

Designing effective model-based reinforcement learning algorithms is difficult because the
ease of data generation must be weighed against the bias of model-generated data. In this …

Cited by 1111 · All 10 versions

[PDF] nowpublishers.com

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com

Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

[PDF] neurips.cc

Recurrent world models facilitate policy evolution

D Ha, J Schmidhuber - Advances in neural information …, 2018 - proceedings.neurips.cc

A generative recurrent neural network is quickly trained in an unsupervised manner to
model popular reinforcement learning environments through compressed spatio-temporal …

Cited by 1143 · All 9 versions

[PDF] neurips.cc

Deep reinforcement learning in a handful of trials using probabilistic dynamics models

K Chua, R Calandra, R McAllister… - Advances in neural …, 2018 - proceedings.neurips.cc

Model-based reinforcement learning (RL) algorithms can attain excellent sample
efficiency, but often lag behind the best model-free algorithms in terms of asymptotic …

Cited by 1593 · All 8 versions

[PDF] 106.54.215.74

[PDF] Uncertainty in deep learning

Y Gal - 2016 - 106.54.215.74

PowerPoint presentation: Uncertainty in Deep Learning, Yarin Gal, 2018.7.29. Different
Uncertainties: Two main types of uncertainty, often confused by practitioners, but …

Cited by 2163 · All 2 versions

[PDF] arxiv.org

Model-ensemble trust-region policy optimization

T Kurutach, I Clavera, Y Duan, A Tamar… - arXiv preprint arXiv …, 2018 - arxiv.org

Model-free reinforcement learning (RL) methods are succeeding in a growing number of
tasks, aided by recent advances in deep learning. However, they tend to suffer from high …

Cited by 570 · All 3 versions

[PDF] neurips.cc

Sample-efficient reinforcement learning with stochastic ensemble value expansion

J Buckman, D Hafner, G Tucker… - Advances in neural …, 2018 - proceedings.neurips.cc

There is growing interest in combining model-free and model-based approaches in
reinforcement learning with the goal of achieving the high performance of model-free …

Cited by 416 · All 8 versions
