Google Scholar

Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier

This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Cited by 398 · All 5 versions

[PDF] nowpublishers.com

A tutorial on Thompson sampling

DJ Russo, B Van Roy, A Kazerouni… - … and Trends® in …, 2018 - nowpublishers.com

Thompson sampling is an algorithm for online decision problems where actions are taken
sequentially in a manner that must balance between exploiting what is known to maximize …

Cited by 1263 · All 19 versions

[PDF] mlr.press

Provably efficient reinforcement learning with linear function approximation

C Jin, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press

Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where function approximation must be deployed …

Cited by 777 · All 4 versions

[PDF] mlr.press

Model-based reinforcement learning with value-targeted regression

A Ayoub, Z Jia, C Szepesvari… - … on Machine Learning, 2020 - proceedings.mlr.press

This paper studies model-based reinforcement learning (RL) for regret minimization. We
focus on finite-horizon episodic RL where the transition model $P$ belongs to a known …

Cited by 348 · All 7 versions

[PDF] neurips.cc

Is Q-learning provably efficient?

C Jin, Z Allen-Zhu, S Bubeck… - Advances in neural …, 2018 - proceedings.neurips.cc

Model-free reinforcement learning (RL) algorithms directly parameterize and
update value functions or policies, bypassing the modeling of the environment. They are …

Cited by 1028 · All 7 versions

[PDF] arxiv.org

Noisy networks for exploration

M Fortunato, MG Azar, B Piot, J Menick… - arXiv preprint arXiv …, 2017 - arxiv.org

We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to
its weights, and show that the induced stochasticity of the agent's policy can be used to aid …

Cited by 1212 · All 7 versions

[PDF] mlr.press

Sunrise: A simple unified framework for ensemble learning in deep reinforcement learning

K Lee, M Laskin, A Srinivas… - … Conference on Machine …, 2021 - proceedings.mlr.press

Off-policy deep reinforcement learning (RL) has been successful in a range of challenging
domains. However, standard off-policy RL algorithms can suffer from several issues, such as …

Cited by 263 · All 6 versions

[PDF] mlr.press

Provably efficient exploration in policy optimization

Q Cai, Z Yang, C Jin, Z Wang - International Conference on …, 2020 - proceedings.mlr.press

While policy-based reinforcement learning (RL) achieves tremendous successes in practice,
it is significantly less understood in theory, especially compared with value-based RL. In …

Cited by 324 · All 10 versions

[PDF] arxiv.org

Parameter space noise for exploration

M Plappert, R Houthooft, P Dhariwal, S Sidor… - arXiv preprint arXiv …, 2017 - arxiv.org

Deep reinforcement learning (RL) methods generally engage in exploratory behavior
through noise injection in the action space. An alternative is to add noise directly to the …

Cited by 788 · All 13 versions

[PDF] neurips.cc

Randomized prior functions for deep reinforcement learning

I Osband, J Aslanides… - Advances in neural …, 2018 - proceedings.neurips.cc

Dealing with uncertainty is essential for efficient reinforcement learning. There is a growing
literature on uncertainty estimation for deep learning from fixed datasets, but many of the …

Cited by 475 · All 10 versions
