Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint, 2020 - arxiv.org

Stabilizing off-policy Q-learning via bootstrapping error reduction

A Kumar, J Fu, M Soh, G Tucker… - Advances in neural …, 2019 - proceedings.neurips.cc
Off-policy reinforcement learning aims to leverage experience collected from prior policies
for sample-efficient learning. However, in practice, commonly used off-policy approximate …

Advantage-weighted regression: Simple and scalable off-policy reinforcement learning

XB Peng, A Kumar, G Zhang, S Levine - arXiv preprint arXiv:1910.00177, 2019 - arxiv.org
In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that
uses standard supervised learning methods as subroutines. Our goal is an algorithm that …
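The snippet above describes policy improvement built from supervised regression. A minimal sketch of the advantage-weighted idea, assuming precomputed advantage estimates, a fixed temperature `beta`, and a weight clip (all illustrative choices, not the paper's exact formulation):

```python
import numpy as np

def awr_weights(advantages, beta=1.0, max_weight=20.0):
    """Exponential advantage weights that upweight high-advantage
    logged actions. `beta` and `max_weight` are illustrative defaults."""
    w = np.exp(np.asarray(advantages, dtype=np.float64) / beta)
    return np.minimum(w, max_weight)  # clipping keeps weights bounded

def awr_policy_loss(log_probs, advantages, beta=1.0):
    """Weighted negative log-likelihood: an ordinary supervised
    regression loss on logged actions, reweighted per sample."""
    w = awr_weights(advantages, beta)
    return -np.mean(w * np.asarray(log_probs, dtype=np.float64))
```

With zero advantage every sample gets weight 1 and the loss reduces to plain behavioral cloning, which is the sense in which the method uses supervised learning as a subroutine.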

Revisiting fundamentals of experience replay

W Fedus, P Ramachandran… - International …, 2020 - proceedings.mlr.press
Experience replay is central to off-policy algorithms in deep reinforcement learning (RL), but
there remain significant gaps in our understanding. We therefore present a systematic and …
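The component under study is easy to sketch. A minimal fixed-capacity, uniform-sampling replay buffer (class and parameter names are illustrative, not taken from the paper):

```python
import random
from collections import deque

class ReplayBuffer:
    """FIFO buffer of transitions with uniform random sampling,
    the standard experience-replay mechanism in off-policy deep RL."""
    def __init__(self, capacity=10_000, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted first
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

The paper's questions (e.g. how capacity and the ratio of gradient updates to collected data affect performance) are precisely about tuning knobs like `capacity` and the sampling scheme in a buffer of this shape.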

When should we prefer offline reinforcement learning over behavioral cloning?

A Kumar, J Hong, A Singh, S Levine - arXiv preprint arXiv:2204.05618, 2022 - arxiv.org
Offline reinforcement learning (RL) algorithms can acquire effective policies by utilizing
previously collected experience, without any online interaction. It is widely understood that …

Datasets and benchmarks for offline safe reinforcement learning

Z Liu, Z Guo, H Lin, Y Yao, J Zhu, Z Cen, H Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents a comprehensive benchmarking suite tailored to offline safe
reinforcement learning (RL) challenges, aiming to foster progress in the development and …