Google Scholar

A modified random network distillation algorithm and its application in USVs naval battle simulation

J Rao, X Xu, H Bian, J Chen, Y Wang, J Lei… - Ocean …, 2022 - Elsevier

Unmanned surface vessel (USV) operations will change the future form of maritime wars
profoundly, and one of the critical factors for victory is the cluster intelligence of USVs …

Cited by 18

Balanced prioritized experience replay in off-policy reinforcement learning

Z Lou, Y Wang, S Shan, K Zhang, H Wei - Neural Computing and …, 2024 - Springer

In Off-Policy reinforcement learning (RL), the experience imbalance problem can
affect learning performance. The experience imbalance problem refers to the phenomenon …

Cited by 3

[PDF] mdpi.com

An AUV target-tracking method combining imitation learning and deep reinforcement learning

Y Mao, F Gao, Q Zhang, Z Yang - Journal of Marine Science and …, 2022 - mdpi.com

This study aims to solve the problem of sparse reward and local convergence when using a
reinforcement learning algorithm as the controller of an AUV. Based on the generative …

Cited by 19

Hierarchical reinforcement learning with unlimited option scheduling for sparse rewards in continuous spaces

Z Huang, Q Liu, F Zhu, L Zhang, L Wu - Expert Systems with Applications, 2024 - Elsevier

The fundamental concept behind option-based hierarchical reinforcement learning (O-HRL)
is to obtain temporal coarse-grained actions and abstract complex situations. Although O …

Cited by 5

An efficient planning method based on deep reinforcement learning with hybrid actions for autonomous driving on highway

M Zhang, K Chen, J Zhu - International Journal of Machine Learning and …, 2023 - Springer

Due to the complexity and uncertainty of the traffic, planning for autonomous driving (AD) on
highway is challenging. Traditional planning algorithms have the problems of low and …

Cited by 7

[PDF] google.com

Addressing hindsight bias in multigoal reinforcement learning

C Bai, L Wang, Y Wang, Z Wang… - IEEE Transactions …, 2021 - ieeexplore.ieee.org

Multigoal reinforcement learning (RL) extends the typical RL with goal-conditional value
functions and policies. One efficient multigoal RL algorithm is the hindsight experience …

Cited by 18

[PDF] researchcommons.org

Self-learning-based multiple spacecraft evasion decision making simulation under sparse reward condition

Z Yu, J Guo, Y Peng, C Bai - Journal of …, 2021 - dc-china-simulation …

In order to improve the ability of spacecraft formation to evade multiple interceptors, aiming
at the low success rate of traditional procedural maneuver evasion, a multi-agent …

Cited by 8

[PDF] researchgate.net

Motion planning of space robot obstacle avoidance based on DDPG algorithm

H Sang, S Wang - 2022 International Conference on Service …, 2022 - ieeexplore.ieee.org

In order to solve the problem of unstructured environment and complex operation task of
space robot, this paper use DDPG algorithm which is data-driven and model free in the …

Cited by 5

[HTML] rhhz.net

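[HTML] Research on sparse reward algorithms for reinforcement learning: theory and experiment

R Yang, J Yan, X Li - 智能系统学报 (CAAI Transactions on Intelligent Systems), 2020 - html.rhhz.net

In recent years, reinforcement learning has achieved great success in sequential decision-making domains such as games and robot control, but in many practical problems
the reward signal is very sparse, making it difficult for the agent to learn an optimal policy from its interaction with the environment; this problem is known as the sparse …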

Cited by 7

[HTML] mdpi.com

[HTML] Simulation Training System for Parafoil Motion Controller Based on Actor–Critic RL Approach

X He, J Liu, J Zhao, R Xu, Q Liu, J Wan, G Yu - Actuators, 2024 - mdpi.com

The unique ram air aerodynamic shape and control rope pulling course of the parafoil
system make it difficult to realize its precise control. At present, the commonly used control …
