- Academic Search

Policy mirror descent for reinforcement learning: Linear convergence, new sampling complexity, and generalized problem classes

G Lan - Mathematical programming, 2023 - Springer

We present new policy mirror descent (PMD) methods for solving reinforcement learning
(RL) problems with either strongly convex or general convex regularizers. By exploring the …

Cited by 156

[PDF] neurips.cc

Learning with little mixing

I Ziemann, S Tu - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We study square loss in a realizable time-series framework with martingale difference noise.
Our main result is a fast rate excess risk bound which shows that whenever a trajectory …

Cited by 34

[PDF] arxiv.org

Active learning for nonlinear system identification with guarantees

H Mania, MI Jordan, B Recht - ar…

… the system's states into a small number …

Cited by 66

[PDF] arxiv.org

Safe, learning-based MPC for highway driving under lane-change uncertainty: A distributionally robust approach

M Schuurmans, A Katriniok, C Meissen, HE Tseng… - Artificial Intelligence, 2023 - Elsevier

We present a case study applying learning-based distributionally robust model predictive
control to highway motion planning under stochastic uncertainty of the lane change behavior …

Cited by 20

[PDF] jmlr.org

Active learning for nonlinear system identification with guarantees

H Mania, MI Jordan, B Recht - Journal of Machine Learning Research, 2022 - jmlr.org

While the identification of nonlinear dynamical systems is a fundamental building block of
model-based reinforcement learning and feedback control, its sample complexity is only …

Cited by 61

[PDF] mlr.press

Estimating the mixing time of ergodic Markov chains

G Wolfer, A Kontorovich - Conference on Learning Theory, 2019 - proceedings.mlr.press

We address the problem of estimating the mixing time $ t_ {\mathsf {mix}} $ of an arbitrary
ergodic finite Markov chain from a single trajectory of length $ m $. The reversible case was …

Cited by 50

[PDF] arxiv.org

A general framework for learning-based distributionally robust MPC of Markov jump systems

M Schuurmans, P Patrinos - IEEE Transactions on Automatic …, 2023 - ieeexplore.ieee.org

In this article, we present a data-driven learning model predictive control (MPC) scheme for
chance-constrained Markov jump systems with unknown switching probabilities. Using …

Cited by 26

[PDF] mlr.press

Adaptive data analysis with correlated observations

A Kontorovich, M Sadigurschi… - … on Machine Learning, 2022 - proceedings.mlr.press

The vast majority of the work on adaptive data analysis focuses on the case where the
samples in the dataset are independent. Several approaches and tools have been …

Cited by 11

