Exploration in deep reinforcement learning: A survey
This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …
A distributional code for value in dopamine-based reinforcement learning
Since its introduction, the reward prediction error theory of dopamine has explained a wealth
of empirical phenomena, providing a unifying framework for understanding the …
Pessimistic bootstrapping for uncertainty-driven offline reinforcement learning
Offline Reinforcement Learning (RL) aims to learn policies from previously collected
datasets without exploring the environment. Directly applying off-policy algorithms to offline …
Reinforcement learning, bit by bit
Reinforcement learning agents have demonstrated remarkable achievements in simulated
environments. Data efficiency poses an impediment to carrying this success over to real …
Exploration in deep reinforcement learning: From single-agent to multiagent domain
Deep reinforcement learning (DRL) and deep multiagent reinforcement learning (MARL)
have achieved significant success across a wide range of domains, including game artificial …
[BOOK][B] Distributional reinforcement learning
MG Bellemare, W Dabney, M Rowland - 2023 - books.google.com
The first comprehensive guide to distributional reinforcement learning, providing a new
mathematical formalism for thinking about decisions from a probabilistic perspective …
Estimating risk and uncertainty in deep reinforcement learning
Reinforcement learning agents are faced with two types of uncertainty. Epistemic uncertainty
stems from limited data and is useful for exploration, whereas aleatoric uncertainty arises …
Thompson sampling for improved exploration in GFlowNets
Generative flow networks (GFlowNets) are amortized variational inference algorithms that
treat sampling from a distribution over compositional objects as a sequential decision …
Adaptive decision-making for automated vehicles under roundabout scenarios using optimization embedded reinforcement learning
The roundabout is a typical changeable, interactive scenario in which automated vehicles
should make adaptive and safe decisions. In this article, an optimization embedded …
DFAC framework: Factorizing the value function via quantile mixture for multi-agent distributional Q-learning
In fully cooperative multi-agent reinforcement learning (MARL) settings, the environments
are highly stochastic due to the partial observability of each agent and the continuously …