- Academic Search

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Cited by 2195. All 3 versions.

[PDF] nowpublishers.com

Neural approaches to conversational AI

J Gao, M Galley, L Li - The 41st international ACM SIGIR conference on …, 2018 - dl.acm.org

This tutorial surveys neural approaches to conversational AI that were developed in the last
few years. We group conversational systems into three categories: (1) question answering …

Cited by 919. All 16 versions.

[PDF] neurips.cc

Conservative q-learning for offline reinforcement learning

A Kumar, A Zhou, G Tucker… - Advances in neural …, 2020 - proceedings.neurips.cc

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is
a key challenge for large-scale real-world applications. Offline RL algorithms promise to …

Cited by 2082. All 10 versions.

[PDF] mlr.press

Is pessimism provably efficient for offline rl?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Cited by 461. All 8 versions.

[PDF] arxiv.org

Behavior regularized offline reinforcement learning

Y Wu, G Tucker, O Nachum - arXiv preprint arXiv:1911.11361, 2019 - arxiv.org

In reinforcement learning (RL) research, it is common to assume access to direct online
interactions with the environment. However, in many real-world applications, access to the …

Cited by 806. All 4 versions.

[PDF] mlr.press

Offline reinforcement learning with realizability and single-policy concentrability

W Zhan, B Huang, A Huang… - … on Learning Theory, 2022 - proceedings.mlr.press

Sample-efficiency guarantees for offline reinforcement learning (RL) often rely on strong
assumptions on both the function classes (e.g., Bellman-completeness) and the data …

Cited by 131. All 6 versions.

[PDF] mlr.press

Information-theoretic considerations in batch reinforcement learning

J Chen, N Jiang - International conference on machine …, 2019 - proceedings.mlr.press

Value-function approximation methods that operate in batch mode have foundational
importance to reinforcement learning (RL). Finite sample guarantees for these methods …

Cited by 434. All 8 versions.

[PDF] neurips.cc

Dualdice: Behavior-agnostic estimation of discounted stationary distribution corrections

O Nachum, Y Chow, B Dai, L Li - Advances in neural …, 2019 - proceedings.neurips.cc

In many real-world reinforcement learning applications, access to the environment is limited
to a fixed dataset, instead of direct (online) interaction with the environment. When using this …

Cited by 383. All 8 versions.

[PDF] arxiv.org

A review of off-policy evaluation in reinforcement learning

M Uehara, C Shi, N Kallus - arXiv preprint arXiv:2212.06355, 2022 - arxiv.org

Reinforcement learning (RL) is one of the most vibrant research frontiers in machine
learning and has been recently applied to solve a number of challenging problems. In this …

Cited by 78. All 2 versions.

[PDF] mlr.press

Batch policy learning under constraints

H Le, C Voloshin, Y Yue - International Conference on …, 2019 - proceedings.mlr.press

When learning policies for real-world domains, two important questions arise: (i) how to
efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate …

Cited by 370. All 14 versions.

Saved to my library

Breaking the curse of horizon: Infinite-horizon off-policy estimation

Offline reinforcement learning: Tutorial, review, and perspectives on open problems
