Google Scholar

Maximum entropy RL (provably) solves some robust RL problems

B Eysenbach, S Levine - arXiv preprint arXiv:2103.06257, 2021 - arxiv.org

Many potential applications of reinforcement learning (RL) require guarantees that the agent
will perform well in the face of disturbances to the dynamics or reward function. In this paper …

Cited by 211

[PDF] mlr.press

Policy gradient Bayesian robust optimization for imitation learning

Z Javed, DS Brown, S Sharma, J Zhu… - International …, 2021 - proceedings.mlr.press

The difficulty in specifying rewards for many real-world problems has led to an increased
focus on learning rewards from human feedback, such as demonstrations. However, there …

Cited by 25

Incorporating convex risk measures into multistage stochastic programming algorithms

O Dowson, DP Morton, BK Pagnoncelli - Annals of Operations Research, 2022 - Springer

Over the last two decades, coherent risk measures have been well studied as a principled,
axiomatic way to characterize the risk of a random variable. Because of this axiomatic …

Cited by 7

[PDF] arxiv.org

Where2Start: Leveraging initial States for Robust and Sample-Efficient Reinforcement Learning

P Parsa, RZ Moayedi, M Bornosi, MM Bejani - arXiv preprint arXiv …, 2023 - arxiv.org

The reinforcement learning algorithms that focus on how to compute the gradient and
choose next actions, are effectively improved the performance of the agents. However, these …

[CITATION][C] Robust Imitation Learning for Risk-Aware Behavior and Sim2Real Transfer

Z Javed - 2022

Entropic risk constrained soft-robust policy optimization

Maximum entropy RL (provably) solves some robust RL problems
