A survey on offline reinforcement learning: Taxonomy, review, and open problems
RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …
A survey on model-based reinforcement learning
Reinforcement learning (RL) interacts with the environment to solve sequential decision-
making problems via a trial-and-error approach. Errors are always undesirable in real-world …
A minimalist approach to offline reinforcement learning
Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms …
Mildly conservative Q-learning for offline reinforcement learning
Offline reinforcement learning (RL) defines the task of learning from a static logged dataset
without continually interacting with the environment. The distribution shift between the …
Critic regularized regression
Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy
optimization from large pre-recorded datasets without online environment interaction. It …
Offline RL without off-policy evaluation
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …
Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is
beginning to show some successes in real-world scenarios. However, much of the research …
Autonomous evaluation and refinement of digital agents
We show that domain-general automatic evaluators can significantly improve the
performance of agents for web navigation and device control. We experiment with multiple …
Offline reinforcement learning via high-fidelity generative behavior modeling
In offline reinforcement learning, weighted regression is a common method to ensure the
learned policy stays close to the behavior policy and to prevent selecting out-of-sample …
Q-learning decision transformer: Leveraging dynamic programming for conditional sequence modelling in offline RL
Recent works have shown that tackling offline reinforcement learning (RL) with a conditional
policy produces promising results. The Decision Transformer (DT) combines the conditional …