Policy mirror descent for reinforcement learning: Linear convergence, new sampling complexity, and generalized problem classes
G Lan - Mathematical programming, 2023 - Springer
We present new policy mirror descent (PMD) methods for solving reinforcement learning
(RL) problems with either strongly convex or general convex regularizers. By exploring the …
Learning with little mixing
We study square loss in a realizable time-series framework with martingale difference noise.
Our main result is a fast rate excess risk bound which shows that whenever a trajectory …
Safe, learning-based MPC for highway driving under lane-change uncertainty: A distributionally robust approach
We present a case study applying learning-based distributionally robust model predictive
control to highway motion planning under stochastic uncertainty of the lane change behavior …
Active learning for nonlinear system identification with guarantees
While the identification of nonlinear dynamical systems is a fundamental building block of
model-based reinforcement learning and feedback control, its sample complexity is only …
Estimating the mixing time of ergodic Markov chains
We address the problem of estimating the mixing time $t_{\mathsf{mix}}$ of an arbitrary
ergodic finite Markov chain from a single trajectory of length $m$. The reversible case was …
A general framework for learning-based distributionally robust MPC of Markov jump systems
In this article, we present a data-driven learning model predictive control (MPC) scheme for
chance-constrained Markov jump systems with unknown switching probabilities. Using …
Adaptive data analysis with correlated observations
The vast majority of the work on adaptive data analysis focuses on the case where the
samples in the dataset are independent. Several approaches and tools have been …