RL for latent MDPs: Regret guarantees and a lower bound

J Kwon, Y Efroni, C Caramanis… - Advances in Neural …, 2021 - proceedings.neurips.cc
In this work, we consider the regret minimization problem for reinforcement learning in latent
Markov Decision Processes (LMDP). In an LMDP, an MDP is randomly drawn from a set of …

Learning mixtures of linear dynamical systems

Y Chen, HV Poor - International conference on machine …, 2022 - proceedings.mlr.press
We study the problem of learning a mixture of multiple linear dynamical systems (LDSs) from
unlabeled short sample trajectories, each generated by one of the LDS models. Despite the …

Provably efficient multi-task reinforcement learning with model transfer

C Zhang, Z Wang - Advances in Neural Information …, 2021 - proceedings.neurips.cc
We study multi-task reinforcement learning (RL) in tabular episodic Markov decision
processes (MDPs). We formulate a heterogeneous multi-player RL problem, in which a …

Reward-mixing MDPs with few latent contexts are learnable

J Kwon, Y Efroni, C Caramanis… - … on Machine Learning, 2023 - proceedings.mlr.press
We consider episodic reinforcement learning in reward-mixing Markov decision processes
(RMMDPs): at the beginning of every episode nature randomly picks a latent reward model …

Sequential transfer in reinforcement learning with a generative model

A Tirinzoni, R Poiani, M Restelli - … Conference on Machine …, 2020 - proceedings.mlr.press
We are interested in how to design reinforcement learning agents that provably reduce the
sample complexity for learning new tasks by transferring knowledge from previously-solved …

TempLe: Learning template of transitions for sample efficient multi-task RL

Y Sun, X Yin, F Huang - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Transferring knowledge among various environments is important for efficiently learning
multiple tasks online. Most existing methods directly use the previously learned models or …

Horizon-free and variance-dependent reinforcement learning for latent Markov decision processes

R Zhou, R Wang, SS Du - International Conference on …, 2023 - proceedings.mlr.press
We study regret minimization for reinforcement learning (RL) in Latent Markov Decision
Processes (LMDPs) with context in hindsight. We design a novel model-based algorithmic …

Near-Optimal Learning and Planning in Separated Latent MDPs

F Chen, C Daskalakis, N Golowich… - arXiv preprint arXiv …, 2024 - arxiv.org
We study computational and statistical aspects of learning Latent Markov Decision
Processes (LMDPs). In this model, the learner interacts with an MDP drawn at the beginning …

Bayesian residual policy optimization: Scalable Bayesian reinforcement learning with clairvoyant experts

G Lee, B Hou, S Choudhury… - 2021 IEEE/RSJ …, 2021 - ieeexplore.ieee.org
Informed and robust decision making in the face of uncertainty is critical for robots operating
in unstructured environments. We formulate this as Bayesian Reinforcement Learning over …

Statistical learning with latent variables: mixture models and reinforcement learning

J Kwon - 2022 - repositories.lib.utexas.edu
Statistical learning with missing or hidden information is ubiquitous in many practical
problems. For example, the success of a certain medical treatment can largely depend on …