Google Scholar

J Schulman, P Moritz, S Levine, M Jordan… - ar…

[PDF] researchgate.net

[PDF] Trust Region Policy Optimization

J Schulman - arXiv preprint arXiv:1502.05477, 2015 - researchgate.net

In this article, we describe a method for optimizing control policies, with guaranteed
monotonic improvement. By making several approximations to the theoretically-justified …

Save · Cite · Cited by 9279 · Related articles · Cached

[PDF] arxiv.org

Deep spatial autoencoders for visuomotor learning

C Finn, XY Tan, Y Duan, T Darrell… - … on Robotics and …, 2016 - ieeexplore.ieee.org

Reinforcement learning provides a powerful and flexible framework for automated
acquisition of robotic motion skills. However, applying reinforcement learning requires a …

Save · Cite · Cited by 677 · Related articles · All 6 versions

[PDF] arxiv.org

Learning deep control policies for autonomous aerial vehicles with mpc-guided policy search

T Zhang, G Kahn, S Levine… - 2016 IEEE international …, 2016 - ieeexplore.ieee.org

Model predictive control (MPC) is an effective method for controlling robotic systems,
particularly autonomous aerial vehicles such as quadcopters. However, application of MPC …

Save · Cite · Cited by 577 · Related articles · All 9 versions

[HTML] acm.org

Reinforcement learning in robotics: A survey

J Kober, JA Bagnell, J Peters - The International Journal of …, 2013 - journals.sagepub.com

Reinforcement learning offers to robotics a framework and set of tools for the design of
sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic …

Save · Cite · Cited by 4560 · Related articles · All 33 versions

[PDF] mlr.press

Guided policy search

S Levine, V Koltun - International conference on machine …, 2013 - proceedings.mlr.press

Direct policy search can effectively scale to high-dimensional systems, but complex policies
with hundreds of parameters often present a challenge for such methods, requiring …

[PDF] jmlr.org

The optimal sample complexity of PAC learning

S Hanneke - Journal of Machine Learning Research, 2016 - jmlr.org

Policy search methods can allow robots to learn control policies for a wide range of tasks,
but practical applications of policy search often require hand-engineered components for …

Save · Cite · Cited by 174 · Related articles · All 5 versions · View as HTML

[PDF] arxiv.org

Deep reinforcement learning for tensegrity robot locomotion

M Zhang, X Geng, J Bruce, K Caluwaerts… - … on robotics and …, 2017 - ieeexplore.ieee.org

Tensegrity robots, composed of rigid rods connected by elastic cables, have a number of
unique properties that make them appealing for use as planetary exploration rovers …

Save · Cite · Cited by 131 · Related articles · All 5 versions

[PDF] escholarship.org

Optimizing expectations: From deep reinforcement learning to stochastic computation graphs

J Schulman - 2016 - escholarship.org

This thesis is mostly focused on reinforcement learning, which is viewed as an optimization
problem: maximize the expected total reward with respect to the parameters of the policy …

Saved to library

Fast biped walking with a reflexive controller and real-time policy searching

High-dimensional continuous control using generalized advantage estimation
