Large sequence models for sequential decision-making: a survey

M Wen, R Lin, H Wang, Y Yang, Y Wen, L Mai… - Frontiers of Computer …, 2023 - Springer
Transformer architectures have facilitated the development of large-scale and general-
purpose sequence models for prediction tasks in natural language processing and computer …

ACE: Cooperative multi-agent Q-learning with bidirectional action-dependency

C Li, J Liu, Y Zhang, Y Wei, Y Niu, Y Yang… - Proceedings of the …, 2023 - ojs.aaai.org
Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem, which
is the ever-changing targets at every iteration when multiple agents update their policies at …

Flexible RAN slicing in Open RAN with constrained multi-agent reinforcement learning

M Zangooei, M Golkarifard, M Rouili… - IEEE Journal on …, 2023 - ieeexplore.ieee.org
Network slicing enables the provision of customized services in next-generation mobile
networks. Accordingly, the network is divided into logically isolated networks that share …

Is centralized training with decentralized execution framework centralized enough for MARL?

Y Zhou, S Liu, Y Qing, K Chen, T Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Centralized Training with Decentralized Execution (CTDE) has recently emerged as a
popular framework for cooperative Multi-Agent Reinforcement Learning (MARL), where …

TiZero: Mastering multi-agent football with curriculum learning and self-play

F Lin, S Huang, T Pearce, W Chen, WW Tu - arXiv preprint arXiv …, 2023 - arxiv.org
Multi-agent football poses an unsolved challenge in AI research. Existing work has focused
on tackling simplified scenarios of the game, or else leveraging expert demonstrations. In …

A survey on large-population systems and scalable multi-agent reinforcement learning

K Cui, A Tahir, G Ekinci, A Elshamanhory… - arXiv preprint arXiv …, 2022 - arxiv.org
The analysis and control of large-population systems is of great interest to diverse areas of
research and engineering, ranging from epidemiology over robotic swarms to economics …

More centralized training, still decentralized execution: Multi-agent conditional policy factorization

J Wang, D Ye, Z Lu - arXiv preprint arXiv:2209.12681, 2022 - arxiv.org
In cooperative multi-agent reinforcement learning (MARL), combining value decomposition
with actor-critic enables agents to learn stochastic policies, which are more suitable for the …

Controlling behavioral diversity in multi-agent reinforcement learning

M Bettini, R Kortvelesy, A Prorok - arXiv preprint arXiv:2405.15054, 2024 - arxiv.org
The study of behavioral diversity in Multi-Agent Reinforcement Learning (MARL) is a
nascent yet promising field. In this context, the present work deals with the question of how …

Exploration via Joint Policy Diversity for Sparse-Reward Multi-Agent Tasks.

P Xu, J Zhang, K Huang - IJCAI, 2023 - ijcai.org
Exploration under sparse rewards is a key challenge for multi-agent reinforcement learning
problems. Previous works argue that complex dynamics between agents and the huge …

Attention-guided contrastive role representations for multi-agent reinforcement learning

Z Hu, Z Zhang, H Li, C Chen, H Ding… - arXiv preprint arXiv …, 2023 - arxiv.org
Real-world multi-agent tasks usually involve dynamic team composition with the emergence
of roles, which should also be a key to efficient cooperation in multi-agent reinforcement …