Comparing model-free and model-based algorithms for offline reinforcement learning

P Swazinna, S Udluft, D Hein, T Runkler - IFAC-PapersOnLine, 2022 - Elsevier
Offline reinforcement learning (RL) algorithms are often designed with environments such
as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists. We …

User-interactive offline reinforcement learning

P Swazinna, S Udluft, T Runkler - arXiv preprint arXiv:2205.10629, 2022 - arxiv.org
Offline reinforcement learning algorithms still lack trust in practice due to the risk that the
learned policy performs worse than the original policy that generated the dataset or behaves …

Measuring data quality for dataset selection in offline reinforcement learning

P Swazinna, S Udluft, T Runkler - 2021 IEEE Symposium …, 2021 - ieeexplore.ieee.org
Recently developed offline reinforcement learning algorithms have made it possible to learn
policies directly from pre-collected datasets, giving rise to a new dilemma for practitioners …

Model-based Offline Quantum Reinforcement Learning

S Eisenmann, D Hein, S Udluft… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
This paper presents the first algorithm for model-based offline quantum reinforcement
learning and demonstrates its functionality on the cart-pole benchmark. The model and the …

Towards user-interactive offline reinforcement learning

P Swazinna, S Udluft, T Runkler - … RL Workshop: Offline RL as a …, 2022 - openreview.net
Offline reinforcement learning algorithms are still not fully trusted by practitioners due to the
risk that the learned policy performs worse than the original policy that generated the dataset …

Policy Regularization for Model-Based Offline Reinforcement Learning

PA Swazinna - 2023 - mediatum.ub.tum.de
This thesis proposes three novel algorithms for offline reinforcement learning, which allow
for training policies from pre-collected datasets without direct environment interaction. The …