Toward a theoretical foundation of policy optimization for learning control policies
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …
Convergence and sample complexity of gradient methods for the model-free linear–quadratic regulator problem
Model-free reinforcement learning attempts to find an optimal control action for an unknown
dynamical system by directly searching over the parameter space of controllers. The …
DeeP-LCC: Data-enabled predictive leading cruise control in mixed traffic flow
For the control of connected and autonomous vehicles (CAVs), most existing methods focus
on model-based strategies. They require explicit knowledge of car-following dynamics of …
Optimizing static linear feedback: Gradient method
The linear quadratic regulator is the fundamental problem of optimal control. Its state
feedback version was posed and solved in the early 1960s. However, the static output feedback …
Sample complexity of linear quadratic gaussian (LQG) control for output feedback systems
This paper studies a class of partially observed Linear Quadratic Gaussian (LQG) problems
with unknown dynamics. We establish an end-to-end sample complexity bound on learning …
On the optimization landscape of dynamic output feedback linear quadratic control
The convergence of policy gradient algorithms hinges on the optimization landscape of the
underlying optimal control problem. Theoretical insights into these algorithms can often be …
Derivative-free policy optimization for linear risk-sensitive and robust control design: Implicit regularization and sample complexity
Direct policy search serves as one of the workhorses in modern reinforcement learning (RL),
and its applications in continuous control tasks have recently attracted increasing attention …
On the stability and convergence of robust adversarial reinforcement learning: A case study on linear quadratic systems
Reinforcement learning (RL) algorithms can fail to generalize due to the gap between the
simulation and the real world. One standard remedy is to use robust adversarial RL (RARL) …
Learning the Kalman filter with fine-grained sample complexity
We develop the first end-to-end sample complexity of model-free policy gradient (PG)
methods in discrete-time infinite-horizon Kalman filtering. Specifically, we introduce the …
Global convergence of policy gradient primal–dual methods for risk-constrained LQRs
While the techniques in optimal control theory are often model-based, the policy optimization
(PO) approach directly optimizes the performance metric of interest. Even though it has been …