Dynamic Policy Decision/Enforcement Security Zoning Through Stochastic Games and Meta Learning

Y Bello, AR Hussein - IEEE Transactions on Network and …, 2024 - ieeexplore.ieee.org
Securing Next Generation Networks (NGNs) remains a prominent topic of discussion in
academia and industries alike, driven by the rapid evolution of cyber attacks. As these …

Absolute Policy Optimization

W Zhao, F Li, Y Sun, R Chen, T Wei, C Liu - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, trust region on-policy reinforcement learning has achieved impressive
results in addressing complex control tasks and gaming scenarios. However, contemporary …

Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline

W Meng, Q Zheng, L Yang, Y Yin, G Pan - arXiv preprint arXiv:2405.02572, 2024 - arxiv.org
Policy-based methods have achieved remarkable success in solving challenging
reinforcement learning problems. Among these methods, off-policy policy gradient methods …

Absolute Policy Optimization: Enhancing Lower Probability Bound of Performance with High Confidence

W Zhao, F Li, Y Sun, R Chen, T Wei, C Liu - Forty-first International … - openreview.net
In recent years, trust region on-policy reinforcement learning has achieved impressive
results in addressing complex control tasks and gaming scenarios. However, contemporary …