A tour of reinforcement learning: The view from continuous control

B Recht - Annual Review of Control, Robotics, and Autonomous …, 2019 - annualreviews.org
This article surveys reinforcement learning from the perspective of optimization and control,
with a focus on continuous control applications. It reviews the general formulation …

Statistical learning theory for control: A finite-sample perspective

A Tsiamis, I Ziemann, N Matni… - IEEE Control Systems …, 2023 - ieeexplore.ieee.org
Learning algorithms have become an integral component of modern engineering solutions.
Examples range from self-driving cars and recommender systems to finance and even …

Online control with adversarial disturbances

N Agarwal, B Bullins, E Hazan… - International …, 2019 - proceedings.mlr.press
We study the control of linear dynamical systems with adversarial disturbances, as opposed
to statistical noise. We present an efficient algorithm that achieves nearly-tight regret bounds …

Naive exploration is optimal for online LQR

M Simchowitz, D Foster - International Conference on …, 2020 - proceedings.mlr.press
We consider the problem of online adaptive control of the linear quadratic regulator, where
the true system parameters are unknown. We prove new upper and lower bounds …

Certainty equivalence is efficient for linear quadratic control

H Mania, S Tu, B Recht - Advances in Neural Information …, 2019 - proceedings.neurips.cc
We study the performance of the certainty equivalent controller on Linear Quadratic (LQ)
control problems with unknown transition dynamics. We show that for both the fully and …

Model-based RL in contextual decision processes: PAC bounds and exponential improvements over model-free approaches

W Sun, N Jiang, A Krishnamurthy… - … on learning theory, 2019 - proceedings.mlr.press
We study the sample complexity of model-based reinforcement learning (henceforth RL) in
general contextual decision processes that require strategic exploration to find a near …

Derivative-free methods for policy optimization: Guarantees for linear quadratic systems

D Malik, A Pananjady, K Bhatia, K Khamaru… - Journal of Machine …, 2020 - jmlr.org
We study derivative-free methods for policy optimization over the class of linear policies. We
focus on characterizing the convergence rate of these methods when applied to linear …

Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret

A Cohen, T Koren, Y Mansour - International Conference on …, 2019 - proceedings.mlr.press
We present the first computationally-efficient algorithm with $\widetilde{O}(\sqrt{T})$ regret
for learning in Linear Quadratic Control systems with unknown dynamics. By that, we resolve …

Information theoretic regret bounds for online nonlinear control

S Kakade, A Krishnamurthy, K Lowrey… - Advances in …, 2020 - proceedings.neurips.cc
This work studies the problem of sequential control in an unknown, nonlinear dynamical
system, where we model the underlying system dynamics as an unknown function in a …

System level synthesis

J Anderson, JC Doyle, SH Low, N Matni - Annual Reviews in Control, 2019 - Elsevier
This article surveys the System Level Synthesis framework, which presents a novel
perspective on constrained robust and optimal controller synthesis for linear systems. We …