A survey of progress on cooperative multi-agent reinforcement learning in open environment

L Yuan, Z Zhang, L Li, C Guan, Y Yu - arXiv preprint arXiv:2312.01058, 2023 - arxiv.org
Multi-agent Reinforcement Learning (MARL) has gained wide attention in recent years and
has made progress in various fields. Specifically, cooperative MARL focuses on training a …

Offline reinforcement learning with behavior value regularization

L Huang, B Dong, W Xie… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Offline reinforcement learning (offline RL) aims to find task-solving policies from prerecorded
datasets without online environment interaction. It is unfortunate that extrapolation errors can …

Hokoff: Real game dataset from Honor of Kings and its offline reinforcement learning benchmarks

Y Qu, B Wang, J Shao, Y Jiang… - Advances in …, 2023 - proceedings.neurips.cc
The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent
Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre …

LLM-empowered state representation for reinforcement learning

B Wang, Y Qu, Y Jiang, J Shao, C Liu, W Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Conventional state representations in reinforcement learning often omit critical task-related
details, presenting a significant challenge for value networks in establishing accurate …

Enhancing multi-scenario applicability of freeway variable speed limit control strategies using continual learning

R Zhang, S Xu, R Yu, J Yu - Accident Analysis & Prevention, 2024 - Elsevier
Variable speed limit (VSL) control benefits freeway operations through dynamic speed limit
adjustment strategies for specific operation scenarios, such as traffic jams, secondary crash …

Scenario-based Accelerated Testing for SOTIF in Autonomous Driving: A Review

L Tang, R Wang, Z Liu, Y Liang, Y Niu… - IEEE Internet of …, 2024 - ieeexplore.ieee.org
The development of intelligent driving systems has drawn significant attention to enhancing
the safety of autonomous vehicles and their intended functionality. Despite this, current …

Doubly mild generalization for offline reinforcement learning

Y Mao, Q Wang, Y Qu, Y Jiang, X Ji - arXiv preprint arXiv:2411.07934, 2024 - arxiv.org
Offline Reinforcement Learning (RL) suffers from the extrapolation error and value
overestimation. From a generalization perspective, this issue can be attributed to the over …

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model

Z Liu, X Yang, S Sun, L Qian, L Wan… - Advances in Neural …, 2025 - proceedings.neurips.cc
Recent progress in generative models has stimulated significant innovations in many fields,
such as image generation and chatbots. Despite their success, these models often produce …

Theoretical investigations and practical enhancements on tail task risk minimization in meta learning

Y Lv, Q Wang, D Liang, Z Xie - arXiv preprint arXiv:2410.22788, 2024 - arxiv.org
Meta learning is a promising paradigm in the era of large models and task distributional
robustness has become an indispensable consideration in real-world scenarios. Recent …

ISFORS-MIX: Multi-agent reinforcement learning with Importance-Sampling-Free Off-policy learning and Regularized-Softmax Mixing network

J Rao, C Wang, M Liu, J Lei, W Giernacki - Knowledge-Based Systems, 2025 - Elsevier
In multi-agent reinforcement learning (MARL), the low quality of value function and the
estimation bias and variance in value function decomposition (VFD) are critical challenges …