- Academic Search

Are we learning yet? a meta review of evaluation failures across machine learning

T Liao, R Taori, ID Raji, L Schmidt - Thirty-fifth Conference on …, 2021 - openreview.net

Many subfields of machine learning share a common stumbling block: evaluation. Advances
in machine learning often evaporate under closer scrutiny or turn out to be less widely …

Cited by 129

[PDF] arxiv.org

Autonomous unmanned aerial vehicle navigation using reinforcement learning: A systematic review

F AlMahamid, K Grolinger - Engineering Applications of Artificial …, 2022 - Elsevier

There is an increasing demand for using Unmanned Aerial Vehicle (UAV), known as drones,
in different applications such as packages delivery, traffic monitoring, search and rescue …

Cited by 84

[PDF] arxiv.org

Mastering diverse domains through world models

D Hafner, J Pasukonis, J Ba, T Lillicrap - arXiv preprint …

Developing a general algorithm that learns to solve tasks across a wide range of
applications has been a fundamental challenge in artificial intelligence. Although current …

Cited by 545

[HTML] nature.com

[HTML] Magnetic control of tokamak plasmas through deep reinforcement learning

J Degrave, F Felici, J Buchli, M Neunert, B Tracey… - Nature, 2022 - nature.com

Nuclear fusion using magnetic confinement, in particular in the tokamak configuration, is a
promising path towards sustainable energy. A core challenge is to shape and maintain a …

Cited by 928

[PDF] openreview.net

What matters in learning from offline human demonstrations for robot manipulation

A Mandlekar, D Xu, J Wong, S Nasiriany… - arXiv preprint …

…

The effects of reward misspecification: Mapping and mitigating misaligned models

A Pan, K Bhatia, J Steinhardt - arXiv preprint arXiv:2201.03544, 2022 - arxiv.org

Reward hacking--where RL agents exploit gaps in misspecified reward functions--has been
widely observed, but not yet systematically studied. To understand how reward hacking …

Cited by 164

[PDF] mlr.press

Decoupling value and policy for generalization in reinforcement learning

R Raileanu, R Fergus - International Conference on …, 2021 - proceedings.mlr.press

Standard deep reinforcement learning algorithms use a shared representation for the policy
and value function, especially when training directly from images. However, we argue that …

Cited by 113


What matters in on-policy reinforcement learning? a large-scale empirical study

Are we learning yet? a meta review of evaluation failures across machine learning