- Academic Search

Crossing the reality gap: A survey on sim-to-real transferability of robot controllers in reinforcement learning

E Salvato, G Fenu, E Medvet, FA Pellegrino - IEEE Access, 2021 - ieeexplore.ieee.org

The growing demand for robots able to act autonomously in complex scenarios has widely
accelerated the introduction of Reinforcement Learning (RL) in robots control applications …

Cited by 181; 4 versions

[PDF] arxiv.org

A tour of reinforcement learning: The view from continuous control

B Recht - Annual Review of Control, Robotics, and Autonomous …, 2019 - annualreviews.org

This article surveys reinforcement learning from the perspective of optimization and control,
with a focus on continuous control applications. It reviews the general formulation …

Cited by 799; 5 versions

[PDF] mlr.press

Global convergence of policy gradient methods for the linear quadratic regulator

M Fazel, R Ge, S Kakade… - … conference on machine …, 2018 - proceedings.mlr.press

Direct policy gradient methods for reinforcement learning and continuous control problems
are a popular approach for a variety of reasons: 1) they are easy to implement without …

Cited by 722; 9 versions

[PDF] mlr.press

A finite time analysis of temporal difference learning with linear function approximation

J Bhandari, D Russo, R Singal - Conference on learning …, 2018 - proceedings.mlr.press

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value
function corresponding to a given policy in a Markov decision process. Although TD is one of …

Cited by 442; 11 versions

[PDF] sciencedirect.com

Projection-based model reduction: Formulations for physics-based machine learning

R Swischuk, L Mainini, B Peherstorfer, K Willcox - Computers & Fluids, 2019 - Elsevier

This paper considers the creation of parametric surrogate models for applications in science
and engineering where the goal is to predict high-dimensional output quantities of interest …

Cited by 380; 5 versions

[PDF] arxiv.org

Simple random search provides a competitive approach to reinforcement learning

H Mania, A Guy, B Recht - arXiv preprint arXiv:1803.07055, 2018 - arxiv.org

A common belief in model-free reinforcement learning is that methods based on random
search in the parameter space of policies exhibit significantly worse sample complexity than …

Cited by 410; 2 versions

[PDF] neurips.cc

Simple random search of static linear policies is competitive for reinforcement learning

H Mania, A Guy, B Recht - Advances in neural information …, 2018 - proceedings.neurips.cc

Model-free reinforcement learning aims to offer off-the-shelf solutions for controlling
dynamical systems without requiring models of the system dynamics. We introduce a model …

Cited by 319; 5 versions

[PDF] mlr.press

Naive exploration is optimal for online LQR

M Simchowitz, D Foster - International Conference on …, 2020 - proceedings.mlr.press

We consider the problem of online adaptive control of the linear quadratic regulator, where
the true system parameters are unknown. We prove new upper and lower bounds …

Cited by 229; 5 versions

[PDF] neurips.cc

Regret bounds for robust adaptive control of the linear quadratic regulator

S Dean, H Mania, N Matni… - Advances in Neural …, 2018 - proceedings.neurips.cc

We consider adaptive control of the Linear Quadratic Regulator (LQR), where an unknown
linear system is controlled subject to quadratic costs. Leveraging recent developments in the …

Cited by 322; 7 versions

[PDF] jmlr.org

Derivative-free methods for policy optimization: Guarantees for linear quadratic systems

D Malik, A Pananjady, K Bhatia, K Khamaru… - Journal of Machine …, 2020 - jmlr.org

We study derivative-free methods for policy optimization over the class of linear policies. We
focus on characterizing the convergence rate of these methods when applied to linear …

Cited by 232; 10 versions

Least-squares temporal difference learning for the linear quadratic regulator