Continuous-time reinforcement learning control: A review of theoretical results, insights on performance, and needs for new designs

BA Wallace, J Si - IEEE Transactions on Neural Networks and …, 2023 - ieeexplore.ieee.org
This exposition discusses continuous-time reinforcement learning (CT-RL) for the control of
affine nonlinear systems. We review four seminal methods that are the centerpieces of the …

Generalized policy iteration using tensor approximation for hybrid control

S Shetty, T Xue, S Calinon - The Twelfth International Conference …, 2024 - openreview.net
Control of dynamic systems involving hybrid actions is a challenging task in robotics. To
address this, we present a novel algorithm called Generalized Policy Iteration using Tensor …

Reinforcement twinning: From digital twins to model-based reinforcement learning

L Schena, PA Marques, R Poletti, S Ahizi… - Journal of …, 2024 - Elsevier
The concept of digital twins promises to revolutionize engineering by offering new avenues
for optimization, control, and predictive maintenance. We propose a novel framework for …

Kernel-Based Optimal Control: An Infinitesimal Generator Approach

P Bevanda, N Hoischen, T Wittmann… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents a novel approach for optimal control of nonlinear stochastic systems
using infinitesimal generator learning within infinite-dimensional reproducing kernel Hilbert …

Continuous-time reinforcement learning: New design algorithms with theoretical insights and performance guarantees

BA Wallace, J Si - IEEE Transactions on Neural Networks and …, 2024 - ieeexplore.ieee.org
Continuous-time reinforcement learning (CT-RL) methods hold great promise in real-world
applications. Adaptive dynamic programming (ADP)-based CT-RL algorithms, especially …

Reinforcement learning control of hypersonic vehicles and performance evaluations

BA Wallace, J Si - Journal of Guidance, Control, and Dynamics, 2024 - arc.aiaa.org
This work presents a new framework for model-based continuous-time reinforcement
learning (CT-RL) control of hypersonic vehicles (HSVs). The predominant classes of CT-RL …

Managing temporal resolution in continuous value estimation: a fundamental trade-off

ZV Zhang, J Kirschner, J Zhang… - Advances in …, 2023 - proceedings.neurips.cc
A default assumption in reinforcement learning (RL) and optimal control is that observations
arrive at discrete time points on a fixed clock cycle. Yet, many applications involve …

Mitigating the curse of horizon in Monte-Carlo returns

A Ayoub, D Szepesvari, F Zanini, B Chan… - Reinforcement …, 2024 - openreview.net
The standard framework in reinforcement learning (RL) dictates that an agent should use
every observation collected from interactions with the environment when updating its value …

Optimal Control of Fluid Restless Multi-armed Bandits: A Machine Learning Approach

D Bertsimas, CW Kim, J Niño-Mora - arXiv preprint arXiv:2502.03725, 2025 - arxiv.org
We propose a machine learning approach to the optimal control of fluid restless multi-armed
bandits (FRMABs) with state equations that are either affine or quadratic in the state …

A New, Physics-Informed Continuous-Time Reinforcement Learning Algorithm with Performance Guarantees

BA Wallace, J Si - Journal of Machine Learning Research, 2024 - jmlr.org
We introduce a new, physics-informed continuous-time reinforcement learning (CT-RL)
algorithm for control of affine nonlinear systems, an area that enables a plethora of well …