Deep deterministic policy gradient algorithm: A systematic review
Deep Reinforcement Learning (DRL) has gained significant adoption in diverse
fields and applications, mainly due to its proficiency in resolving complicated decision …
Policy gradient and actor-critic learning in continuous time and space: Theory and algorithms
We study policy gradient (PG) for reinforcement learning in continuous time and space
under the regularized exploratory formulation developed by Wang et al. (2020). We …
Text-based interactive recommendation via constraint-augmented reinforcement learning
Text-based interactive recommendation provides richer user preferences and has
demonstrated advantages over traditional interactive recommender systems. However …
Policy optimization for continuous reinforcement learning
We study reinforcement learning (RL) in the setting of continuous time and space, for an
infinite horizon with a discounted objective and the underlying dynamics driven by a …
Data-driven robotic manipulation of cloth-like deformable objects: The present, challenges and future prospects
Manipulating cloth-like deformable objects (CDOs) is a long-standing problem in the
robotics community. CDOs are flexible (non-rigid) objects that do not show a detectable level …
q-Learning in continuous time
We study the continuous-time counterpart of Q-learning for reinforcement learning (RL)
under the entropy-regularized, exploratory diffusion process formulation introduced by Wang …
Model-based reinforcement learning for semi-Markov decision processes with neural ODEs
We present two elegant solutions for modeling continuous-time dynamics, in a novel model-
based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs) …
Reinforcement learning for jump-diffusions, with financial applications
We study continuous-time reinforcement learning (RL) for stochastic control in which system
dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized …
Efficient exploration in continuous-time model-based reinforcement learning
Reinforcement learning algorithms typically consider discrete-time dynamics, even though
the underlying systems are often continuous in time. In this paper, we introduce a model …
Control frequency adaptation via action persistence in batch reinforcement learning
The choice of the control frequency of a system has a relevant impact on the ability of
reinforcement learning algorithms to learn a highly performing policy. In this paper, we …