RL for latent MDPs: Regret guarantees and a lower bound
In this work, we consider the regret minimization problem for reinforcement learning in latent
Markov Decision Processes (LMDP). In an LMDP, an MDP is randomly drawn from a set of …
Learning mixtures of linear dynamical systems
Y Chen, HV Poor - International conference on machine …, 2022 - proceedings.mlr.press
We study the problem of learning a mixture of multiple linear dynamical systems (LDSs) from
unlabeled short sample trajectories, each generated by one of the LDS models. Despite the …
Provably efficient multi-task reinforcement learning with model transfer
We study multi-task reinforcement learning (RL) in tabular episodic Markov decision
processes (MDPs). We formulate a heterogeneous multi-player RL problem, in which a …
Reward-mixing MDPs with few latent contexts are learnable
We consider episodic reinforcement learning in reward-mixing Markov decision processes
(RMMDPs): at the beginning of every episode nature randomly picks a latent reward model …
Sequential transfer in reinforcement learning with a generative model
We are interested in how to design reinforcement learning agents that provably reduce the
sample complexity for learning new tasks by transferring knowledge from previously-solved …
Temple: Learning template of transitions for sample efficient multi-task RL
Transferring knowledge among various environments is important for efficiently learning
multiple tasks online. Most existing methods directly use the previously learned models or …
Horizon-free and variance-dependent reinforcement learning for latent Markov decision processes
We study regret minimization for reinforcement learning (RL) in Latent Markov Decision
Processes (LMDPs) with context in hindsight. We design a novel model-based algorithmic …
Near-Optimal Learning and Planning in Separated Latent MDPs
We study computational and statistical aspects of learning Latent Markov Decision
Processes (LMDPs). In this model, the learner interacts with an MDP drawn at the beginning …
Bayesian residual policy optimization: Scalable Bayesian reinforcement learning with clairvoyant experts
Informed and robust decision making in the face of uncertainty is critical for robots operating
in unstructured environments. We formulate this as Bayesian Reinforcement Learning over …
Statistical learning with latent variables: mixture models and reinforcement learning
J Kwon - 2022 - repositories.lib.utexas.edu
Statistical learning with missing or hidden information is ubiquitous in many practical
problems. For example, the success of a certain medical treatment can largely depend on …