Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org
In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Non-stationary reinforcement learning without prior knowledge: An optimal black-box approach

CY Wei, H Luo - Conference on Learning Theory, 2021 - proceedings.mlr.press
We propose a black-box reduction that turns a certain reinforcement learning algorithm with
optimal regret in a (near-) stationary environment into another algorithm with optimal …

Corruption-robust exploration in episodic reinforcement learning

T Lykouris, M Simchowitz… - … on Learning Theory, 2021 - proceedings.mlr.press
We initiate the study of episodic reinforcement learning under adversarial corruptions in both
the rewards and the transition probabilities of the underlying system extending recent results …

Near-optimal model-free reinforcement learning in non-stationary episodic MDPs

W Mao, K Zhang, R Zhu… - … on Machine Learning, 2021 - proceedings.mlr.press
We consider model-free reinforcement learning (RL) in non-stationary Markov decision
processes. Both the reward functions and the state transition functions are allowed to vary …

Provably efficient primal-dual reinforcement learning for CMDPs with non-stationary objectives and constraints

Y Ding, J Lavaei - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov
decision processes (CMDPs) with non-stationary objectives and constraints, which plays a …

Dynamic regret of online Markov decision processes

P Zhao, LF Li, ZH Zhou - International Conference on …, 2022 - proceedings.mlr.press
We investigate online Markov Decision Processes (MDPs) with adversarially
changing loss functions and known transitions. We choose dynamic regret as the …

Provably efficient model-free algorithms for non-stationary CMDPs

H Wei, A Ghosh, N Shroff, L Ying… - … Conference on Artificial …, 2023 - proceedings.mlr.press
We study model-free reinforcement learning (RL) algorithms in episodic non-stationary
constrained Markov decision processes (CMDPs), in which an agent aims to maximize the …

Non-stationary reinforcement learning under general function approximation

S Feng, M Yin, R Huang, YX Wang… - International …, 2023 - proceedings.mlr.press
General function approximation is a powerful tool to handle large state and action spaces in
a broad range of reinforcement learning (RL) scenarios. However, theoretical understanding …

Performative reinforcement learning

D Mandal, S Triantafyllou… - … Conference on Machine …, 2023 - proceedings.mlr.press
We introduce the framework of performative reinforcement learning where the policy chosen
by the learner affects the underlying reward and transition dynamics of the environment …

Efficient learning in non-stationary linear Markov decision processes

A Touati, P Vincent - arXiv preprint arXiv:2010.12870, 2020 - arxiv.org
We study episodic reinforcement learning in non-stationary linear (aka low-rank) Markov
Decision Processes (MDPs), i.e., both the reward and transition kernel are linear with respect …