Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org
In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Representation learning for online and offline RL in low-rank MDPs

M Uehara, X Zhang, W Sun - arXiv preprint arXiv:2110.04652, 2021 - arxiv.org
This work studies the question of Representation Learning in RL: how can we learn a
compact low-dimensional representation such that on top of the representation we can …

Pessimistic model-based offline reinforcement learning under partial coverage

M Uehara, W Sun - arXiv preprint arXiv:2107.06226, 2021 - arxiv.org
We study model-based offline Reinforcement Learning with general function approximation
without a full coverage assumption on the offline data distribution. We present an algorithm …

Efficient reinforcement learning in block MDPs: A model-free representation learning approach

X Zhang, Y Song, M Uehara, M Wang… - International …, 2022 - proceedings.mlr.press
We present BRIEE, an algorithm for efficient reinforcement learning in Markov Decision
Processes with block-structured dynamics (i.e., Block MDPs), where rich observations are …

Model-free representation learning and exploration in low-rank MDPs

A Modi, J Chen, A Krishnamurthy, N Jiang… - Journal of Machine …, 2024 - jmlr.org
The low-rank MDP has emerged as an important model for studying representation learning
and exploration in reinforcement learning. With a known representation, several model-free …

Model-based RL with optimistic posterior sampling: Structural conditions and sample complexity

A Agarwal, T Zhang - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We propose a general framework to design posterior sampling methods for model-based
RL. We show that the proposed algorithms can be analyzed by reducing regret to Hellinger …

Provable benefits of representational transfer in reinforcement learning

A Agarwal, Y Song, W Sun, K Wang… - The Thirty Sixth …, 2023 - proceedings.mlr.press
We study the problem of representational transfer in RL, where an agent first pretrains in a
number of source tasks to discover a shared representation, which is subsequently …

Learning bellman complete representations for offline policy evaluation

J Chang, K Wang, N Kallus… - … Conference on Machine …, 2022 - proceedings.mlr.press
We study representation learning for Offline Reinforcement Learning (RL), focusing on the
important task of Offline Policy Evaluation (OPE). Recent work shows that, in contrast to …

Model selection in batch policy optimization

J Lee, G Tucker, O Nachum… - … Conference on Machine …, 2022 - proceedings.mlr.press
We study the problem of model selection in batch policy optimization: given a fixed,
partial-feedback dataset and M model classes, learn a policy with performance that is competitive …

Context-lumpable stochastic bandits

CW Lee, Q Liu, Y Abbasi Yadkori… - Advances in …, 2023 - proceedings.neurips.cc
We consider a contextual bandit problem with $S$ contexts and $K$ actions. In each
round $t = 1, 2, \dots$ the learner observes a random context and chooses an action based …