Google Scholar

A survey of progress on cooperative multi-agent reinforcement learning in open environment

L Yuan, Z Zhang, L Li, C Guan, Y Yu - arXiv preprint arXiv:2312.01058, 2023 - arxiv.org

Multi-agent Reinforcement Learning (MARL) has gained wide attention in recent years and
has made progress in various fields. Specifically, cooperative MARL focuses on training a …

Cited by 20 - All 3 versions

Offline reinforcement learning with behavior value regularization

L Huang, B Dong, W Xie… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Offline reinforcement learning (offline RL) aims to find task-solving policies from prerecorded
datasets without online environment interaction. It is unfortunate that extrapolation errors can …

Cited by 6 - All 3 versions

[PDF] neurips.cc

Hokoff: Real game dataset from honor of kings and its offline reinforcement learning benchmarks

Y Qu, B Wang, J Shao, Y Jiang… - Advances in …, 2023 - proceedings.neurips.cc

The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent
Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre …

Cited by 7 - All 7 versions

[PDF] arxiv.org

LLM-empowered state representation for reinforcement learning

B Wang, Y Qu, Y Jiang, J Shao, C Liu, W Yang… - arXiv preprint arXiv …, 2024 - arxiv.org

Conventional state representations in reinforcement learning often omit critical task-related
details, presenting a significant challenge for value networks in establishing accurate …

Cited by 5 - All 8 versions

Enhancing multi-scenario applicability of freeway variable speed limit control strategies using continual learning

R Zhang, S Xu, R Yu, J Yu - Accident Analysis & Prevention, 2024 - Elsevier

Variable speed limit (VSL) control benefits freeway operations through dynamic speed limit
adjustment strategies for specific operation scenarios, such as traffic jams, secondary crash …

Cited by 2 - All 6 versions

[PDF] researchgate.net

Scenario-based Accelerated Testing for SOTIF in Autonomous Driving: A Review

L Tang, R Wang, Z Liu, Y Liang, Y Niu… - IEEE Internet of …, 2024 - ieeexplore.ieee.org

The development of intelligent driving systems has drawn significant attention to enhancing
the safety of autonomous vehicles and their intended functionality. Despite this, current …

All 2 versions

[PDF] arxiv.org

Doubly mild generalization for offline reinforcement learning

Y Mao, Q Wang, Y Qu, Y Jiang, X Ji - arXiv preprint arXiv:2411.07934, 2024 - arxiv.org

Offline Reinforcement Learning (RL) suffers from the extrapolation error and value
overestimation. From a generalization perspective, this issue can be attributed to the over …

Cited by 2 - All 5 versions

[PDF] neurips.cc

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model

Z Liu, X Yang, S Sun, L Qian, L Wan… - Advances in Neural …, 2025 - proceedings.neurips.cc

Recent progress in generative models has stimulated significant innovations in many fields,
such as image generation and chatbots. Despite their success, these models often produce …

All 5 versions

[PDF] arxiv.org

Theoretical investigations and practical enhancements on tail task risk minimization in meta learning

Y Lv, Q Wang, D Liang, Z Xie - arXiv preprint arXiv:2410.22788, 2024 - arxiv.org

Meta learning is a promising paradigm in the era of large models and task distributional
robustness has become an indispensable consideration in real-world scenarios. Recent …

Cited by 1 - All 3 versions

ISFORS-MIX: Multi-agent reinforcement learning with Importance-Sampling-Free Off-policy learning and Regularized-Softmax Mixing network

J Rao, C Wang, M Liu, J Lei, W Giernacki - Knowledge-Based Systems, 2025 - Elsevier

In multi-agent reinforcement learning (MARL), the low quality of value function and the
estimation bias and variance in value function decomposition (VFD) are critical challenges …

Counterfactual conservative q learning for offline multi-agent reinforcement learning

A survey of progress on cooperative multi-agent reinforcement learning in open environment
