A review of safe reinforcement learning: Methods, theory and applications

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making
tasks. However, safety concerns are raised during deploying RL in real-world …

A review of safe reinforcement learning: Methods, theories and applications

S Gu, L Yang, Y Du, G Chen, F Walter… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making
tasks. However, safety concerns are raised during deploying RL in real-world …

Last-iterate convergent policy gradient primal-dual methods for constrained MDPs

D Ding, CY Wei, K Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc
We study the problem of computing an optimal policy of an infinite-horizon discounted
constrained Markov decision process (constrained MDP). Despite the popularity of …

A dual approach to constrained Markov decision processes with entropy regularization

D Ying, Y Ding, J Lavaei - International Conference on …, 2022 - proceedings.mlr.press
We study entropy-regularized constrained Markov decision processes (CMDPs) under the
soft-max parameterization, in which an agent aims to maximize the entropy-regularized …

Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

D Ding, K Zhang, J Duan, T Başar… - arXiv preprint arXiv …, 2022 - arxiv.org
We study sequential decision making problems aimed at maximizing the expected total
reward while satisfying a constraint on the expected total utility. We employ the natural policy …

Anchor-changing regularized natural policy gradient for multi-objective reinforcement learning

R Zhou, T Liu, D Kalathil… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study policy optimization for Markov decision processes (MDPs) with multiple reward
value functions, which are to be jointly optimized according to given criteria such as …

Finding correlated equilibrium of constrained Markov game: A primal-dual approach

Z Chen, S Ma, Y Zhou - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Constrained Markov game is a fundamental problem that covers many applications, where
multiple players compete with each other under behavioral constraints. The existing …

Adaptive User Interface Generation Through Reinforcement Learning: A Data-Driven Approach to Personalization and Optimization

Q Sun, Y Xue, Z Song - arXiv preprint arXiv:2412.16837, 2024 - arxiv.org
This study introduces an adaptive user interface generation technology, emphasizing the
role of Human-Computer Interaction (HCI) in optimizing user experience. By focusing on …

Provably efficient generalized Lagrangian policy optimization for safe multi-agent reinforcement learning

D Ding, X Wei, Z Yang, Z Wang… - Learning for dynamics …, 2023 - proceedings.mlr.press
We examine online safe multi-agent reinforcement learning using constrained Markov
games in which agents compete by maximizing their expected total rewards under a …

Stochastic optimization under hidden convexity

I Fatkhullin, N He, Y Hu - arXiv preprint arXiv:2401.00108, 2023 - arxiv.org
In this work, we consider constrained stochastic optimization problems under hidden
convexity, i.e., those that admit a convex reformulation via non-linear (but invertible) map $ c …