A survey on offline reinforcement learning: Taxonomy, review, and open problems

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Cal-QL: Calibrated offline RL pre-training for efficient online fine-tuning

M Nakamoto, S Zhai, A Singh… - Advances in …, 2023 - proceedings.neurips.cc
A compelling use case of offline reinforcement learning (RL) is to obtain a policy initialization
from existing datasets followed by fast online fine-tuning with limited interaction. However …
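The calibration idea this entry alludes to can be sketched roughly as lower-bounding pre-trained Q-values by the value of a reference policy, so that conservative pre-training does not leave Q-values too low for efficient online fine-tuning. This is a simplified, hypothetical rendering; the function name, shapes, and the exact placement of the bound inside the conservative objective are assumptions, not the paper's precise formulation.

```python
import numpy as np

def calibrated_push_down(q_policy_actions, ref_values):
    """Sketch of a Cal-QL-style calibration step.

    q_policy_actions: learned Q-values at policy-proposed actions, shape (B,)
    ref_values:       values of a reference (e.g. behavior) policy, shape (B,)

    Lower-bounding the Q-values that the conservative objective pushes down
    keeps the pre-trained critic "calibrated" rather than arbitrarily low.
    """
    return np.maximum(q_policy_actions, ref_values)
```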

VIP: Towards universal visual reward and representation via value-implicit pre-training

YJ Ma, S Sodhani, D Jayaraman, O Bastani… - arXiv preprint arXiv …, 2022 - arxiv.org
Reward and representation learning are two long-standing challenges for learning an
expanding set of robot manipulation skills from sensory observations. Given the inherent …

A minimalist approach to offline reinforcement learning

S Fujimoto, SS Gu - Advances in neural information …, 2021 - proceedings.neurips.cc
Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms …
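The minimalist remedy this entry refers to is commonly summarized as adding a behavior-cloning term to the TD3 policy update, which keeps the policy near dataset actions and so limits value-estimation error on out-of-distribution actions. The sketch below is a simplified rendering under that reading; the normalization constant and function signature are illustrative assumptions.

```python
import numpy as np

def td3_bc_policy_loss(q_values, pi_actions, data_actions, alpha=2.5):
    """Behavior-cloning-regularized policy loss in the spirit of TD3+BC.

    q_values:     Q(s, pi(s)) over a batch, shape (B,)
    pi_actions:   actions proposed by the current policy, shape (B, A)
    data_actions: actions recorded in the offline dataset, shape (B, A)

    The Q term is rescaled by the mean absolute Q-value so that alpha
    trades off RL and BC in a scale-invariant way (a simplified sketch).
    """
    lam = alpha / (np.abs(q_values).mean() + 1e-8)
    bc_term = ((pi_actions - data_actions) ** 2).mean()  # stay near the data
    return -(lam * q_values.mean()) + bc_term
```

When the policy exactly reproduces the dataset actions, the loss reduces to the (negated, rescaled) mean Q-value, so maximizing return and imitating the data pull on the same objective.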

Offline reinforcement learning with fisher divergence critic regularization

I Kostrikov, R Fergus, J Tompson… - … on Machine Learning, 2021 - proceedings.mlr.press
Many modern approaches to offline Reinforcement Learning (RL) utilize behavior
regularization, typically augmenting a model-free actor critic algorithm with a penalty …

COMBO: Conservative offline model-based policy optimization

T Yu, A Kumar, R Rafailov… - Advances in neural …, 2021 - proceedings.neurips.cc
Model-based reinforcement learning (RL) algorithms, which learn a dynamics
model from logged experience and perform conservative planning under the learned model …

Is pessimism provably efficient for offline RL?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …
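Pessimism in this setting is usually implemented as a Bellman backup that subtracts an uncertainty quantifier, so state-action pairs poorly covered by the dataset are deliberately undervalued. The backup below is a minimal tabular-style sketch of that idea; the function name, the scalar inputs, and the omitted value clipping are simplifying assumptions.

```python
import numpy as np

def pessimistic_q_update(r, gamma, v_next, bonus):
    """One pessimistic Bellman backup:

        Q(s, a) = r(s, a) + gamma * E[V(s')] - Gamma(s, a)

    where Gamma(s, a) is an uncertainty quantifier. Subtracting (rather
    than adding, as optimistic online methods do) steers the learned
    policy away from poorly covered (s, a) pairs. Truncation of Q to a
    valid range is omitted for brevity.
    """
    return r + gamma * v_next - bonus
```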

MOReL: Model-based offline reinforcement learning

R Kidambi, A Rajeswaran… - Advances in neural …, 2020 - proceedings.neurips.cc
In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based
solely on a dataset of historical interactions with the environment. This serves as an extreme …

MOPO: Model-based offline policy optimization

T Yu, G Thomas, L Yu, S Ermon… - Advances in …, 2020 - proceedings.neurips.cc
Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a
batch of previously collected data. This problem setting is compelling, because it offers the …
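The mechanism usually attributed to this line of work is to train in the learned model but penalize the model's reward by an estimate of its own error, often taken from disagreement within a dynamics ensemble. The sketch below illustrates that recipe; the disagreement measure and function names are illustrative choices, not the paper's exact estimator.

```python
import numpy as np

def ensemble_uncertainty(next_state_preds):
    """Model-error proxy from ensemble disagreement.

    next_state_preds: predictions from an ensemble of dynamics models,
                      shape (ensemble, batch, state_dim).
    Returns the per-sample norm of the across-ensemble std-dev, shape (batch,).
    """
    return np.linalg.norm(next_state_preds.std(axis=0), axis=-1)

def penalized_reward(model_reward, uncertainty, lam=1.0):
    """MOPO-style reward: r~(s, a) = r(s, a) - lam * u(s, a).

    Large penalties where the model disagrees with itself discourage the
    policy from exploiting regions the model cannot predict reliably.
    """
    return model_reward - lam * uncertainty
```

When all ensemble members agree, the penalty vanishes and the policy sees the model's reward unchanged; the coefficient lam trades off return against staying in well-modeled regions.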