Bilinear classes: A structural framework for provable generalization in RL
This work introduces Bilinear Classes, a new structural framework, which permits
generalization in reinforcement learning in a wide variety of settings through the use of …
Bellman eluder dimension: New rich classes of RL problems, and sample-efficient algorithms
Finding the minimal structural assumptions that empower sample-efficient learning is one of
the most important research directions in Reinforcement Learning (RL). This paper …
Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes
We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …
Nearly minimax optimal reinforcement learning for linear Markov decision processes
We study reinforcement learning (RL) with linear function approximation. For episodic time-
inhomogeneous linear Markov decision processes (linear MDPs) whose transition …
FLAMBE: Structural complexity and representation learning of low rank MDPs
In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common
practice to make parametric assumptions where values or policies are functions of some low …
Learning near optimal policies with low inherent Bellman error
We study the exploration problem with approximate linear action-value functions in episodic
reinforcement learning under the notion of low inherent Bellman error, a condition normally …
Unpacking reward shaping: Understanding the benefits of reward engineering on sample complexity
The success of reinforcement learning in a variety of challenging sequential decision-
making problems has been much discussed, but often ignored in this discussion is the …
The role of coverage in online reinforcement learning
Coverage conditions--which assert that the data logging distribution adequately covers the
state space--play a fundamental role in determining the sample complexity of offline …
Reinforcement learning with general value function approximation: Provably efficient approach via bounded eluder dimension
Value function approximation has demonstrated phenomenal empirical success in
reinforcement learning (RL). Nevertheless, despite a handful of recent progress on …
Reward-free RL is no harder than reward-aware RL in linear Markov decision processes
Reward-free reinforcement learning (RL) considers the setting where the agent does not
have access to a reward function during exploration, but must propose a near-optimal policy …