- Academic Search

Efficient exploration in continuous-time model-based reinforcement learning

L Treven, J Hübotter, F Dörfler… - Advances in Neural …, 2024 - proceedings.neurips.cc

Reinforcement learning algorithms typically consider discrete-time dynamics, even though
the underlying systems are often continuous in time. In this paper, we introduce a model …

Cited by 6 · 6 versions

[PDF] arxiv.org

Do Transformer World Models Give Better Policy Gradients?

M Ma, T Ni, C Gehring, P D'Oro, PL Bacon - arXiv preprint arXiv …, 2024 - arxiv.org

A natural approach for reinforcement learning is to predict future rewards by unrolling a
neural network world model, and to backpropagate through the resulting computational …

Cited by 1 · 3 versions

[PDF] arxiv.org

A Pontryagin Perspective on Reinforcement Learning

O Eberhard, C Vernade, M Muehlebach - arXiv preprint arXiv:2405.18100, 2024 - arxiv.org

Reinforcement learning has traditionally focused on learning state-dependent policies to
solve optimal control problems in a closed-loop fashion. In this work, we introduce the …

3 versions

[PDF] openreview.net

A Differentiable Sequence Model Perspective on Policy Gradients

M Ma, P D'Oro, T Ni, C Gehring, PL Bacon - openreview.net

Progress in sequence modeling with deep learning has been driven by the advances in
temporal credit assignment coming from better gradient propagation in neural network …

Myriad: a real-world testbed to bridge trajectory optimization and deep learning

Efficient exploration in continuous-time model-based reinforcement learning
