Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier
This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

A distributional code for value in dopamine-based reinforcement learning

W Dabney, Z Kurth-Nelson, N Uchida, CK Starkweather… - Nature, 2020 - nature.com
Since its introduction, the reward prediction error theory of dopamine has explained a wealth
of empirical phenomena, providing a unifying framework for understanding the …

Pessimistic bootstrapping for uncertainty-driven offline reinforcement learning

C Bai, L Wang, Z Yang, Z Deng, A Garg, P Liu… - arXiv preprint arXiv …, 2022 - arxiv.org
Offline Reinforcement Learning (RL) aims to learn policies from previously collected
datasets without exploring the environment. Directly applying off-policy algorithms to offline …

Reinforcement learning, bit by bit

X Lu, B Van Roy, V Dwaracherla… - … and Trends® in …, 2023 - nowpublishers.com
Reinforcement learning agents have demonstrated remarkable achievements in simulated
environments. Data efficiency poses an impediment to carrying this success over to real …

Exploration in deep reinforcement learning: From single-agent to multiagent domain

J Hao, T Yang, H Tang, C Bai, J Liu… - … on Neural Networks …, 2023 - ieeexplore.ieee.org
Deep reinforcement learning (DRL) and deep multiagent reinforcement learning (MARL)
have achieved significant success across a wide range of domains, including game artificial …

[BOOK][B] Distributional reinforcement learning

MG Bellemare, W Dabney, M Rowland - 2023 - books.google.com
The first comprehensive guide to distributional reinforcement learning, providing a new
mathematical formalism for thinking about decisions from a probabilistic perspective …

Estimating risk and uncertainty in deep reinforcement learning

WR Clements, B Van Delft, BM Robaglia… - arXiv preprint arXiv …, 2019 - arxiv.org
Reinforcement learning agents are faced with two types of uncertainty. Epistemic uncertainty
stems from limited data and is useful for exploration, whereas aleatoric uncertainty arises …

Thompson sampling for improved exploration in GFlowNets

J Rector-Brooks, K Madan, M Jain, M Korablyov… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative flow networks (GFlowNets) are amortized variational inference algorithms that
treat sampling from a distribution over compositional objects as a sequential decision …

Adaptive decision-making for automated vehicles under roundabout scenarios using optimization embedded reinforcement learning

Y Zhang, B Gao, L Guo, H Guo… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
The roundabout is a typical changeable, interactive scenario in which automated vehicles
should make adaptive and safe decisions. In this article, an optimization embedded …

DFAC framework: Factorizing the value function via quantile mixture for multi-agent distributional Q-learning

WF Sun, CK Lee, CY Lee - International Conference on …, 2021 - proceedings.mlr.press
In fully cooperative multi-agent reinforcement learning (MARL) settings, the environments
are highly stochastic due to the partial observability of each agent and the continuously …