A Survey of the Sparse Reward Problem in Deep Reinforcement Learning
杨惟轶, 白辰甲, 蔡超, 赵英男, 刘鹏 - 计算机科学, 2020 - joispormari.com
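Reinforcement learning, an important branch of machine learning, is a class of methods that find optimal policies through interaction with an environment.
In recent years, reinforcement learning has been widely combined with deep learning, forming the research field of deep reinforcement learning …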
Self-adaptive priority correction for prioritized experience replay
H Zhang, C Qu, J Zhang, J Li - Applied sciences, 2020 - mdpi.com
Deep Reinforcement Learning (DRL) is a promising approach for general artificial
intelligence. However, most DRL methods suffer from the problem of data inefficiency. To …
Prioritized experience replay method based on experience reward
J Gao, X Li, W Liu, J Zhao - 2021 International Conference on …, 2021 - ieeexplore.ieee.org
In recent years, artificial intelligence has been widely used in modern construction, and
reinforcement learning methods have played an important role in it. The experience replay …
Survivability-aware routing restoration mechanism for smart grid communication network in large-scale failures
Natural disasters such as earthquakes have consecutive impacts on the smart grid because
of aftershock activities. To guarantee service requirements and smart grid stable operations …
Task Analysis Methods Based on Deep Reinforcement Learning
X Gong, P Peng, L Rong… - Journal of …, 2024 - dc-china-simulation …
In response to the high coupling of task interaction and many influencing factors in task
analysis, a task analysis method based on sequence decoupling and deep reinforcement …
Intelligent anti-jamming decision algorithm of bivariate frequency hopping pattern based on DQN with PER and Pareto
J Zhu, Z Zhao, S Zheng - … Journal of Information Technology and Web …, 2022 - igi-global.com
To improve the anti-jamming performance of frequency hopping system in complex
electromagnetic environment, a Deep Q-Network algorithm with priority experience replay …
Research on decision making of intelligent vehicle based on composite priority experience replay
S Wang, B Zhang, Q Liang… - Intelligent Decision …, 2024 - journals.sagepub.com
To address the problems of underutilization of samples and unstable training for intelligent
vehicle training in the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, a …
Task Analysis Methods Based on Deep Reinforcement Learning
龚雪, 彭鹏菲, 荣里, 郑雅莲, 姜俊 - 系统仿真学报, 2024 - china-simulation.com
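To address the high coupling of collaborative task interactions and the many influencing factors in task analysis, a task analysis method based on sequence decoupling and deep reinforcement
learning is proposed, achieving task decomposition and task sequence reconstruction under complex constraints. Based on task information, the work designs …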
UAV Path Planning Based on PF-DQN in Unknown Environments.
何金, 丁勇, 杨勇, 黄鑫城 - Ordnance Industry Automation, 2020 - search.ebscohost.com
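To solve the problem of model-free path planning for UAVs, a DQN path planning method based on potential function (PF)
rewards is proposed for settings where environmental information is unknown. A continuous state space for the UAV in the environment is established, and 360 is divided equally into a number of angles to serve as …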
Covert Enemy-Approach Strategy for UAVs Based on Double Deep Q Network
何金, 丁勇, 高振龙 - Electronics Optics & Control, 2020 - opticsjournal.net
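For the problem of covert UAV approach to an enemy in a continuous state space using deep reinforcement learning, a covert-approach
double deep Q network (DDQN) method based on the Markov decision process is proposed. Generating the target value function with DDQN addresses the traditional DQN's …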