Reinforcement learning: An overview

K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …

Deep Laplacian-based options for temporally-extended exploration

M Klissarov, MC Machado - arXiv preprint arXiv:2301.11181, 2023 - arxiv.org
Selecting exploratory actions that generate a rich stream of experience for better learning is
a fundamental challenge in reinforcement learning (RL). An approach to tackle this problem …

Timing as an Action: Learning When to Observe and Act

H Zhou, A Huang, K Azizzadenesheli… - International …, 2024 - proceedings.mlr.press
In standard reinforcement learning setups, the agent receives observations and performs
actions at evenly spaced intervals. However, in many real-world settings, observations are …

Reward-respecting subtasks for model-based reinforcement learning

RS Sutton, MC Machado, GZ Holland, D Szepesvari… - Artificial Intelligence, 2023 - Elsevier
To achieve the ambitious goals of artificial intelligence, reinforcement learning must include
planning with a model of the world that is abstract in state and time. Deep learning has made …

Reasoning with latent diffusion in offline reinforcement learning

S Venkatraman, S Khaitan, RT Akella, J Dolan… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline reinforcement learning (RL) holds promise as a means to learn high-reward policies
from a static dataset, without the need for further environment interactions. However, a key …

Artificial general intelligence (AGI)-native wireless systems: A journey beyond 6G

W Saad, O Hashash, CK Thomas, C Chaccour… - arXiv preprint arXiv …, 2024 - arxiv.org
Building future wireless systems that support services like digital twins (DTs) is challenging
to achieve through advances to conventional technologies like meta-surfaces. While artificial …

Multi-step generalized policy improvement by leveraging approximate models

LN Alegre, A Bazzan, A Nowé… - Advances in Neural …, 2024 - proceedings.neurips.cc
We introduce a principled method for performing zero-shot transfer in reinforcement learning
(RL) by exploiting approximate models of the environment. Zero-shot transfer in RL has …

Provably (more) sample-efficient offline RL with options

X Hu, H Leung - Advances in Neural Information Processing …, 2023 - proceedings.neurips.cc
The options framework yields empirical success in long-horizon planning problems of
reinforcement learning (RL). Recent works show that options help improve the sample …

Scenario-level knowledge transfer for motion planning of autonomous driving via successor representation

H Lu, C Lu, H Wang, J Gong, M Zhu, H Yang - Transportation Research Part …, 2024 - Elsevier
For autonomous vehicles, transfer learning can enhance performance by making better use
of previously learned knowledge in newly encountered scenarios, which holds great …

When does self-prediction help? Understanding auxiliary tasks in reinforcement learning

C Voelcker, T Kastner, I Gilitschenski… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate the impact of auxiliary learning tasks such as observation reconstruction and
latent self-prediction on the representation learning problem in reinforcement learning. We …