Revisiting LQR control from the perspective of receding-horizon policy gradient
We revisit in this letter the discrete-time linear quadratic regulator (LQR) problem from the
perspective of receding-horizon policy gradient (RHPG), a newly developed model-free …
Controlgym: Large-scale safety-critical control environments for benchmarking reinforcement learning algorithms
We introduce controlgym, a library of thirty-six safety-critical industrial control settings, and
ten infinite-dimensional partial differential equation (PDE)-based control problems …
Controlgym: Large-scale control environments for benchmarking reinforcement learning algorithms
We introduce controlgym, a library of thirty-six industrial control settings, and ten infinite-
dimensional partial differential equation (PDE)-based control problems. Integrated within the …
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
Markov Decision Processes (MDPs) are a formal framework for modeling and solving
sequential decision-making problems. In finite-time horizons such problems are relevant for …
Structure Matters: Dynamic Policy Gradient
In this work, we study $\gamma$-discounted infinite-horizon tabular Markov decision
processes (MDPs) and introduce a framework called dynamic policy gradient (DynPG). The …
Decision Transformer as a Foundation Model for Partially Observable Continuous Control
Closed-loop control of nonlinear dynamical systems with partial-state observability demands
expert knowledge of a diverse, less standardized set of theoretical tools. Moreover, it …
Policy Optimization for PDE Control with a Warm Start
Dimensionality reduction is crucial for controlling nonlinear partial differential equations
(PDE) through a "reduce-then-design" strategy, which identifies a reduced-order model and …
Dynamic approaches for stochastic gradient methods in reinforcement learning
S Klein - 2024 - madoc.bib.uni-mannheim.de
This work addresses the convergence behaviour of first-order optimization methods in the
context of reinforcement learning. Specifically, we analyse the vanilla Policy Gradient (PG) …