- Academic Search

An information-theoretic perspective on intrinsic motivation in reinforcement learning: A survey

A Aubret, L Matignon, S Hassas - Entropy, 2023 - mdpi.com

The reinforcement learning (RL) research area is very active, with an important number of
new contributions, especially considering the emergent field of deep RL (DRL). However, a …

Cited by: 47

Exploration by maximizing Rényi entropy for reward-free RL framework

C Zhang, Y Cai, L Huang, J Li - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org

Exploration is essential for reinforcement learning (RL). To face the challenges of
exploration, we consider a reward-free RL framework that completely separates exploration …

Cited by: 50

Greedification operators for policy optimization: Investigating forward and reverse kl divergences

A Chan, H Silva, S Lim, T Kozuno, AR Mahmood… - Journal of Machine …, 2022 - jmlr.org

Approximate Policy Iteration (API) algorithms alternate between (approximate) policy
evaluation and (approximate) greedification. Many different approaches have been explored …

Cited by: 33

Effective Exploration Based on the Structural Information Principles

X Zeng, H Peng, A Li - ar…

… for efficient exploration in reinforcement learning

M Yuan, M Pun, D Wang, Y Chen, H Li - arxiv preprint arxiv:2107.08888, 2021 - arxiv.org

Maintaining the long-term exploration capability of the agent remains one of the critical
challenges in deep reinforcement learning. A representative solution is to leverage reward …

Cited by: 9

Towards Efficient Coordination of Power Distribution Network and Electric Vehicles: Deep Reinforcement Learning with Robust Reward Function

P Li, S Chen, Z Wei, Q Wu, G Sun… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org

The rapid increase in electric vehicle (EV) usage has led to an urgent need for coordinating
EV charging activities with power distribution networks (PDNs) to accommodate the resulting …

Multi-goal Reinforcement Learning via Exploring Entropy-regularized Successor Matching

X Feng, Y Zhou - IEEE Transactions on Games, 2023 - ieeexplore.ieee.org

Multigoal reinforcement learning (RL) algorithms tend to achieve and generalize over
diverse goals. However, unlike single-goal agents, multigoal agents struggle to break …

Prioritizing Compression Explains Human Perceptual Preferences

FM Lopez, BE Shi, J Triesch - Intrinsically-Motivated and Open-Ended … - openreview.net

We present prioritized representation learning (PRL), a method to enhance unsupervised
representation learning by drawing inspiration from active learning and intrinsic motivations …

Data-Driven Sequential Decision Making by Understanding and Adopting Rational Behavior

KH Kim - 2023 - search.proquest.com

A remarkable feature of an intelligent agent is the ability to make sequences of smart
decisions that are executed in coordination to reach goals. As can be seen by watching …

Entropy regularization with discounted future state distribution in policy gradient methods

An information-theoretic perspective on intrinsic motivation in reinforcement learning: A survey
