Google Академик

Safe policy improvement with baseline bootstrap**

Turnitin 降AI改写早检测系统早降重系统 Turnitin-UK版万方检测-期刊版维普编辑部版 Grammarly检测 Paperpass检测 checkpass检测 PaperYY检测

Reinforcement learning algorithms: A brief survey

AK Shakya, G Pillai, S Chakrabarty - Expert Systems with Applications, 2023 - Elsevier

Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …

Сачувај Цитирај 225 пута наведен Сродни чланци Све верзије (2)

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arxiv preprint arxiv:2005.01643, 2020 - arxiv.org

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Сачувај Цитирај 2218 пута наведен Сродни чланци Све верзије (3) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

A minimalist approach to offline reinforcement learning

S Fujimoto, SS Gu - Advances in neural information …, 2021 - proceedings.neurips.cc

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms …

Сачувај Цитирај 911 пута наведен Сродни чланци Све верзије (7) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Conservative q-learning for offline reinforcement learning

A Kumar, A Zhou, G Tucker… - Advances in neural …, 2020 - proceedings.neurips.cc

Effectively leveraging large, previously collected datasets in reinforcement learn-ing (RL) is
a key challenge for large-scale real-world applications. Offline RL algorithms promise to …

Сачувај Цитирај 2110 пута наведен Сродни чланци Све верзије (10) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Bellman-consistent pessimism for offline reinforcement learning

T **e, CA Cheng, N Jiang, P Mineiro… - Advances in neural …, 2021 - proceedings.neurips.cc

The use of pessimism, when reasoning about datasets lacking exhaustive exploration has
recently gained prominence in offline reinforcement learning. Despite the robustness it adds …

Сачувај Цитирај 305 пута наведен Сродни чланци Све верзије (14) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Combo: Conservative offline model-based policy optimization

T Yu, A Kumar, R Rafailov… - Advances in neural …, 2021 - proceedings.neurips.cc

Abstract Model-based reinforcement learning (RL) algorithms, which learn a dynamics
model from logged experience and perform conservative planning under the learned model …

Сачувај Цитирај 464 пута наведен Сродни чланци Све верзије (7) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Bridging offline reinforcement learning and imitation learning: A tale of pessimism

P Rashidinejad, B Zhu, C Ma, J Jiao… - Advances in Neural …, 2021 - proceedings.neurips.cc

Offline (or batch) reinforcement learning (RL) algorithms seek to learn an optimal policy from
a fixed dataset without active data collection. Based on the composition of the offline dataset …

Сачувај Цитирај 331 пута наведен Сродни чланци Све верзије (8) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] mlr.press

Is pessimism provably efficient for offline rl?

Y **, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Сачувај Цитирај 466 пута наведен Сродни чланци Све верзије (8) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Morel: Model-based offline reinforcement learning

R Kidambi, A Rajeswaran… - Advances in neural …, 2020 - proceedings.neurips.cc

In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based
solely on a dataset of historical interactions with the environment. This serves as an extreme …

Сачувај Цитирај 797 пута наведен Сродни чланци Све верзије (7) HTML верзија

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

For sale: State-action representation learning for deep reinforcement learning

S Fujimoto, WD Chang, E Smith… - Advances in neural …, 2023 - proceedings.neurips.cc

In reinforcement learning (RL), representation learning is a proven tool for complex image-
based tasks, but is often overlooked for environments with low-level states, such as physical …

Сачувај Цитирај 60 пута наведен Сродни чланци Све верзије (6) HTML верзија

Направи обавештење

Цитирај

Напредна претрага

Сачувано у мојој библиотеци

Safe policy improvement with baseline bootstrap**

Reinforcement learning algorithms: A brief survey

Offline reinforcement learning: Tutorial, review, and perspectives on open problems

A minimalist approach to offline reinforcement learning

Conservative q-learning for offline reinforcement learning

Bellman-consistent pessimism for offline reinforcement learning

Combo: Conservative offline model-based policy optimization

Bridging offline reinforcement learning and imitation learning: A tale of pessimism

Is pessimism provably efficient for offline rl?

Morel: Model-based offline reinforcement learning

For sale: State-action representation learning for deep reinforcement learning