Relay hindsight experience replay: Self-guided continual reinforcement learning for sequential object manipulation tasks with sparse rewards

Y Luo, Y Wang, K Dong, Q Zhang, E Cheng, Z Sun… - Neurocomputing, 2023 - Elsevier
Learning with sparse rewards remains a challenging problem in reinforcement learning
(RL). In particular, for sequential object manipulation tasks, the RL agent generally only …

Reinforcement learning control of hydraulic servo system based on TD3 algorithm

X Yuan, Y Wang, R Zhang, Q Gao, Z Zhou, R Zhou… - Machines, 2022 - mdpi.com
This paper aims at the characteristics of nonlinear, time-varying and parameter coupling in a
hydraulic servo system. An intelligent control method is designed that uses self-learning …

Addressing hindsight bias in multigoal reinforcement learning

C Bai, L Wang, Y Wang, Z Wang… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Multigoal reinforcement learning (RL) extends the typical RL with goal-conditional value
functions and policies. One efficient multigoal RL algorithm is the hindsight experience …

Regularly updated deterministic policy gradient algorithm

S Han, W Zhou, S Lü, J Yu - Knowledge-Based Systems, 2021 - Elsevier
Deep Deterministic Policy Gradient (DDPG) algorithm is one of the most well-known
reinforcement learning methods. However, this method is inefficient and unstable in practical …

Variational dynamic for self-supervised exploration in deep reinforcement learning

C Bai, P Liu, K Liu, L Wang, Y Zhao… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Efficient exploration remains a challenging problem in reinforcement learning, especially for
tasks where extrinsic rewards from environments are sparse or even totally disregarded …

An Intelligent Strategy Decision Method for Collaborative Jamming Based On Hierarchical Multi-Agent Reinforcement Learning

W Zhang, T Zhao, Z Zhao, Y Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Aiming at the problem of intelligent cooperative jamming decision-making against frequency
agility and frequency diversity in cognitive electronic warfare, an intelligent cooperative …

Anticipatory Classifier System With Episode-Based Experience Replay

Ł Smierzchała, N Kozłowski, O Unold - IEEE Access, 2023 - ieeexplore.ieee.org
Deep reinforcement learning with Experience Replay (ER), including Deep Q-Network
(DQN), has been used to solve many multi-step learning problems. However, in practice …

Long-Term Feature Extraction Via Frequency Prediction for Efficient Reinforcement Learning

J Wang, M Ye, Y Kuang, R Yang… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org
Sample efficiency remains a key challenge for the deployment of deep reinforcement
learning (RL) in real-world scenarios. A common approach is to learn efficient …

Prioritized hindsight with dual buffer for meta-reinforcement learning

SW Beyene, JH Han - Electronics, 2022 - mdpi.com
Sharing prior knowledge across multiple robotic manipulation tasks is a challenging
research topic. Although the state-of-the-art deep reinforcement learning (DRL) algorithms …

Anchor: The achieved goal to replace the subgoal for hierarchical reinforcement learning

R Li, Z Cai, T Huang, W Zhu - Knowledge-Based Systems, 2021 - Elsevier
Hierarchical reinforcement learning (HRL) extends traditional reinforcement learning
methods to complex tasks, such as the continuous control task with long horizon. As an …