Toward a theoretical foundation of policy optimization for learning control policies
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …
Constrained-cost adaptive dynamic programming for optimal control of discrete-time nonlinear systems
Q Wei, T Li - IEEE Transactions on Neural Networks and …, 2023 - ieeexplore.ieee.org
For discrete-time nonlinear systems, this research is concerned with optimal control
problems (OCPs) with constrained cost, and a novel value iteration with constrained cost …
On the optimization landscape of dynamic output feedback linear quadratic control
The convergence of policy gradient algorithms hinges on the optimization landscape of the
underlying optimal control problem. Theoretical insights into these algorithms can often be …
Global Convergence of Direct Policy Search for State-Feedback Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential
Direct policy search has been widely applied in modern reinforcement learning and
continuous control. However, the theoretical properties of direct policy search on nonsmooth …
Complexity of Derivative-Free Policy Optimization for Structured Control
The applications of direct policy search in reinforcement learning and continuous control
have received increasing attention. In this work, we present novel theoretical results on the …
Provably efficient generalized lagrangian policy optimization for safe multi-agent reinforcement learning
We examine online safe multi-agent reinforcement learning using constrained Markov
games in which agents compete by maximizing their expected total rewards under a …
Infinite-horizon risk-constrained linear quadratic regulator with average cost
The behaviour of a stochastic dynamical system may be largely influenced by those
low-probability, yet extreme events. To address such occurrences, this paper proposes an …
Reinforcement learning for linear exponential quadratic Gaussian problem
J Lai, J Xiong - Systems & Control Letters, 2024 - Elsevier
This paper addresses the infinite-horizon linear exponential quadratic Gaussian problem for
a class of stochastic systems with additive noise. A model-free generalized policy iteration …
Controlgym: Large-scale control environments for benchmarking reinforcement learning algorithms
We introduce controlgym, a library of thirty-six industrial control settings, and ten
infinite-dimensional partial differential equation (PDE)-based control problems. Integrated within the …
Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs
We study the problem of computing deterministic optimal policies for constrained Markov
decision processes (MDPs) with continuous state and action spaces, which are widely …