Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org
In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Representation learning for online and offline RL in low-rank MDPs

M Uehara, X Zhang, W Sun - arXiv preprint arXiv:2110.04652, 2021 - arxiv.org
This work studies the question of Representation Learning in RL: how can we learn a
compact low-dimensional representation such that on top of the representation we can …

Pessimistic model-based offline reinforcement learning under partial coverage

M Uehara, W Sun - arXiv preprint arXiv:2107.06226, 2021 - arxiv.org
We study model-based offline Reinforcement Learning with general function approximation
without a full coverage assumption on the offline data distribution. We present an algorithm …

Efficient reinforcement learning in block MDPs: A model-free representation learning approach

X Zhang, Y Song, M Uehara, M Wang… - International …, 2022 - proceedings.mlr.press
We present BRIEE, an algorithm for efficient reinforcement learning in Markov Decision
Processes with block-structured dynamics (i.e., Block MDPs), where rich observations are …

Model-free representation learning and exploration in low-rank MDPs

A Modi, J Chen, A Krishnamurthy, N Jiang… - Journal of Machine …, 2024 - jmlr.org
The low-rank MDP has emerged as an important model for studying representation learning
and exploration in reinforcement learning. With a known representation, several model-free …

Model-based RL with optimistic posterior sampling: Structural conditions and sample complexity

A Agarwal, T Zhang - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We propose a general framework to design posterior sampling methods for model-based
RL. We show that the proposed algorithms can be analyzed by reducing regret to Hellinger …

Provable benefits of representational transfer in reinforcement learning

A Agarwal, Y Song, W Sun, K Wang… - The Thirty Sixth …, 2023 - proceedings.mlr.press
We study the problem of representational transfer in RL, where an agent first pretrains in a
number of source tasks to discover a shared representation, which is subsequently …

Learning bellman complete representations for offline policy evaluation

J Chang, K Wang, N Kallus… - … Conference on Machine …, 2022 - proceedings.mlr.press
We study representation learning for Offline Reinforcement Learning (RL), focusing on the
important task of Offline Policy Evaluation (OPE). Recent work shows that, in contrast to …

Model selection in batch policy optimization

J Lee, G Tucker, O Nachum… - … Conference on Machine …, 2022 - proceedings.mlr.press
We study the problem of model selection in batch policy optimization: given a fixed,
partial-feedback dataset and M model classes, learn a policy with performance that is competitive …

Context-lumpable stochastic bandits

CW Lee, Q Liu, Y Abbasi Yadkori… - Advances in …, 2023 - proceedings.neurips.cc
We consider a contextual bandit problem with $S$ contexts and $K$ actions. In each
round $t = 1, 2, \dots$ the learner observes a random context and chooses an action based …