Bellman eluder dimension: New rich classes of RL problems, and sample-efficient algorithms
Finding the minimal structural assumptions that empower sample-efficient learning is one of
the most important research directions in Reinforcement Learning (RL). This paper …
Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes
We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …
Optimality and approximation with policy gradient methods in Markov decision processes
Policy gradient (PG) methods are among the most effective methods in challenging
reinforcement learning problems with large state and/or action spaces. However, little is …
Information-theoretic considerations in batch reinforcement learning
Value-function approximation methods that operate in batch mode have foundational
importance to reinforcement learning (RL). Finite sample guarantees for these methods …
Representation learning for online and offline RL in low-rank MDPs
This work studies the question of Representation Learning in RL: how can we learn a
compact low-dimensional representation such that on top of the representation we can …
Nearly minimax optimal reinforcement learning for linear Markov decision processes
We study reinforcement learning (RL) with linear function approximation. For episodic time-
inhomogeneous linear Markov decision processes (linear MDPs) whose transition …
Learning near optimal policies with low inherent Bellman error
We study the exploration problem with approximate linear action-value functions in episodic
reinforcement learning under the notion of low inherent Bellman error, a condition normally …
Is a good representation sufficient for sample efficient reinforcement learning?
Modern deep learning methods provide effective means to learn good representations.
However, is a good representation itself sufficient for sample efficient reinforcement …
Provably efficient RL with rich observations via latent state decoding
We study the exploration problem in episodic MDPs with rich observations generated from a
small number of latent states. Under certain identifiability assumptions, we demonstrate how …
Model-based RL in contextual decision processes: PAC bounds and exponential improvements over model-free approaches
We study the sample complexity of model-based reinforcement learning (henceforth RL) in
general contextual decision processes that require strategic exploration to find a near …