Reinforcement learning: An overview
K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …
Deep Laplacian-based options for temporally-extended exploration
M Klissarov, MC Machado - arXiv preprint arXiv:2301.11181, 2023 - arxiv.org
Selecting exploratory actions that generate a rich stream of experience for better learning is
a fundamental challenge in reinforcement learning (RL). An approach to tackle this problem …
Timing as an Action: Learning When to Observe and Act
In standard reinforcement learning setups, the agent receives observations and performs
actions at evenly spaced intervals. However, in many real-world settings, observations are …
Reward-respecting subtasks for model-based reinforcement learning
To achieve the ambitious goals of artificial intelligence, reinforcement learning must include
planning with a model of the world that is abstract in state and time. Deep learning has made …
Reasoning with latent diffusion in offline reinforcement learning
Offline reinforcement learning (RL) holds promise as a means to learn high-reward policies
from a static dataset, without the need for further environment interactions. However, a key …
Artificial general intelligence (AGI)-native wireless systems: A journey beyond 6G
Building future wireless systems that support services like digital twins (DTs) is challenging
to achieve through advances to conventional technologies like meta-surfaces. While artificial …
Multi-step generalized policy improvement by leveraging approximate models
We introduce a principled method for performing zero-shot transfer in reinforcement learning
(RL) by exploiting approximate models of the environment. Zero-shot transfer in RL has …
Provably (more) sample-efficient offline RL with options
The options framework yields empirical success in long-horizon planning problems of
reinforcement learning (RL). Recent works show that options help improve the sample …
Scenario-level knowledge transfer for motion planning of autonomous driving via successor representation
For autonomous vehicles, transfer learning can enhance performance by making better use
of previously learned knowledge in newly encountered scenarios, which holds great …
When does self-prediction help? Understanding auxiliary tasks in reinforcement learning
We investigate the impact of auxiliary learning tasks such as observation reconstruction and
latent self-prediction on the representation learning problem in reinforcement learning. We …