A survey on offline reinforcement learning: Taxonomy, review, and open problems

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arxiv preprint arxiv:2005.01643, 2020 - arxiv.org
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Decision transformer: Reinforcement learning via sequence modeling

L Chen, K Lu, A Rajeswaran, K Lee… - Advances in neural …, 2021 - proceedings.neurips.cc
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence
modeling problem. This allows us to draw upon the simplicity and scalability of the …

The dormant neuron phenomenon in deep reinforcement learning

G Sokar, R Agarwal, PS Castro… - … Conference on Machine …, 2023 - proceedings.mlr.press
In this work we identify the dormant neuron phenomenon in deep reinforcement learning,
where an agent's network suffers from an increasing number of inactive neurons, thereby …

Offline reinforcement learning as one big sequence modeling problem

M Janner, Q Li, S Levine - Advances in neural information …, 2021 - proceedings.neurips.cc
Reinforcement learning (RL) is typically viewed as the problem of estimating single-step
policies (for model-free RL) or single-step models (for model-based RL), leveraging the …

Causal machine learning: A survey and open problems

J Kaddour, A Lynch, Q Liu, MJ Kusner… - arxiv preprint arxiv …, 2022 - arxiv.org
Causal Machine Learning (CausalML) is an umbrella term for machine learning methods
that formalize the data-generation process as a structural causal model (SCM). This …

Generative skill chaining: Long-horizon skill planning with diffusion models

UA Mishra, S Xue, Y Chen… - Conference on Robot …, 2023 - proceedings.mlr.press
Long-horizon tasks, usually characterized by complex subtask dependencies, present a
significant challenge in manipulation planning. Skill chaining is a practical approach to …

Masked world models for visual control

Y Seo, D Hafner, H Liu, F Liu, S James… - … on Robot Learning, 2023 - proceedings.mlr.press
Visual model-based reinforcement learning (RL) has the potential to enable sample-efficient
robot learning from visual observations. Yet the current approaches typically train a single …

Temporal difference learning for model predictive control

N Hansen, X Wang, H Su - arxiv preprint arxiv:2203.04955, 2022 - arxiv.org
Data-driven model predictive control has two key advantages over model-free methods: a
potential for improved sample efficiency through model learning, and better performance as …

Synthetic experience replay

C Lu, P Ball, YW Teh… - Advances in Neural …, 2023 - proceedings.neurips.cc
A key theme in the past decade has been that when large neural networks and large
datasets combine they can produce remarkable results. In deep reinforcement learning (RL) …