Deep deterministic policy gradient algorithm: A systematic review

EH Sumiea, SJ Abdulkadir, HS Alhussian, SM Al-Selwi… - Heliyon, 2024 - cell.com
Deep Reinforcement Learning (DRL) has gained significant adoption in diverse
fields and applications, mainly due to its proficiency in resolving complicated decision …

Policy gradient and actor-critic learning in continuous time and space: Theory and algorithms

Y Jia, XY Zhou - Journal of Machine Learning Research, 2022 - jmlr.org
We study policy gradient (PG) for reinforcement learning in continuous time and space
under the regularized exploratory formulation developed by Wang et al. (2020). We …

Text-based interactive recommendation via constraint-augmented reinforcement learning

R Zhang, T Yu, Y Shen, H Jin… - Advances in neural …, 2019 - proceedings.neurips.cc
Text-based interactive recommendation provides richer user preferences and has
demonstrated advantages over traditional interactive recommender systems. However …

Policy optimization for continuous reinforcement learning

H Zhao, W Tang, D Yao - Advances in Neural Information …, 2023 - proceedings.neurips.cc
We study reinforcement learning (RL) in the setting of continuous time and space, for an
infinite horizon with a discounted objective and the underlying dynamics driven by a …

Data-driven robotic manipulation of cloth-like deformable objects: The present, challenges and future prospects

HA Kadi, K Terzić - Sensors, 2023 - mdpi.com
Manipulating cloth-like deformable objects (CDOs) is a long-standing problem in the
robotics community. CDOs are flexible (non-rigid) objects that do not show a detectable level …

q-Learning in continuous time

Y Jia, XY Zhou - Journal of Machine Learning Research, 2023 - jmlr.org
We study the continuous-time counterpart of Q-learning for reinforcement learning (RL)
under the entropy-regularized, exploratory diffusion process formulation introduced by Wang …

Model-based reinforcement learning for semi-Markov decision processes with neural ODEs

J Du, J Futoma, F Doshi-Velez - Advances in Neural …, 2020 - proceedings.neurips.cc
We present two elegant solutions for modeling continuous-time dynamics, in a novel
model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs) …

Reinforcement learning for jump-diffusions, with financial applications

X Gao, L Li, XY Zhou - arXiv preprint arXiv:2405.16449, 2024 - arxiv.org
We study continuous-time reinforcement learning (RL) for stochastic control in which system
dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized …

Efficient exploration in continuous-time model-based reinforcement learning

L Treven, J Hübotter, B Sukhija… - Advances in Neural …, 2023 - proceedings.neurips.cc
Reinforcement learning algorithms typically consider discrete-time dynamics, even though
the underlying systems are often continuous in time. In this paper, we introduce a model …

Control frequency adaptation via action persistence in batch reinforcement learning

AM Metelli, F Mazzolini, L Bisi… - International …, 2020 - proceedings.mlr.press
The choice of the control frequency of a system has a relevant impact on the ability of
reinforcement learning algorithms to learn a highly performing policy. In this paper, we …