Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
A review of off-policy evaluation in reinforcement learning
Reinforcement learning (RL) is one of the most vibrant research frontiers in machine
learning and has been recently applied to solve a number of challenging problems. In this …
Is pessimism provably efficient for offline RL?
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …
Bellman-consistent pessimism for offline reinforcement learning
The use of pessimism, when reasoning about datasets lacking exhaustive exploration, has
recently gained prominence in offline reinforcement learning. Despite the robustness it adds …
Adversarially trained actor critic for offline reinforcement learning
We propose Adversarially Trained Actor Critic (ATAC), a new model-free algorithm
for offline reinforcement learning (RL) under insufficient data coverage, based on the …
Offline reinforcement learning with realizability and single-policy concentrability
Sample-efficiency guarantees for offline reinforcement learning (RL) often rely on strong
assumptions on both the function classes (e.g., Bellman-completeness) and the data …
Bellman eluder dimension: New rich classes of RL problems, and sample-efficient algorithms
Finding the minimal structural assumptions that empower sample-efficient learning is one of
the most important research directions in Reinforcement Learning (RL). This paper …
A two-timescale stochastic algorithm framework for bilevel optimization: Complexity analysis and application to actor-critic
This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization.
Bilevel optimization is a class of problems which exhibits a two-level structure, and its goal is …
A theoretical analysis of deep Q-learning
Despite the great empirical success of deep reinforcement learning, its theoretical
foundation is less well understood. In this work, we make the first attempt to theoretically …
On the theory of policy gradient methods: Optimality, approximation, and distribution shift
Policy gradient methods are among the most effective methods in challenging reinforcement
learning problems with large state and/or action spaces. However, little is known about even …