A survey of progress on cooperative multi-agent reinforcement learning in open environment

L Yuan, Z Zhang, L Li, C Guan, Y Yu - arXiv preprint arXiv:2312.01058, 2023 - arxiv.org
Multi-agent Reinforcement Learning (MARL) has gained wide attention in recent years and
has made progress in various fields. Specifically, cooperative MARL focuses on training a …

Graph decision transformer

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2303.03747, 2023 - arxiv.org
Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies
from static trajectory data without interacting with the environment. Recently, offline RL has …

Decomposed Prompt Decision Transformer for Efficient Unseen Task Generalization

H Zheng, L Shen, Y Luo, T Liu… - Advances in Neural …, 2025 - proceedings.neurips.cc
Multi-task offline reinforcement learning aims to develop a unified policy for diverse tasks
without requiring real-time interaction with the environment. Recent work explores sequence …

Prompt-tuning decision transformer with preference ranking

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2305.09648, 2023 - arxiv.org
Prompt-tuning has emerged as a promising method for adapting pre-trained models to
downstream tasks or aligning with human preferences. Prompt learning is widely used in …

SaFormer: A conditional sequence modeling approach to offline safe reinforcement learning

Q Zhang, L Zhang, H Xu, L Shen, B Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline safe RL is of great practical relevance for deploying agents in real-world applications.
However, acquiring constraint-satisfying policies from the fixed dataset is non-trivial for …

PDiT: Interleaving perception and decision-making transformers for deep reinforcement learning

H Mao, R Zhao, Z Li, Z Xu, H Chen, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Designing better deep networks and better reinforcement learning (RL) algorithms are both
important for deep RL. This work studies the former. Specifically, the Perception and …

Instructed diffuser with temporal condition guidance for offline reinforcement learning

J Hu, Y Sun, S Huang, SY Guo, H Chen, L Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent works have shown the potential of diffusion models in computer vision and natural
language processing. Apart from the classical supervised learning fields, diffusion models …

Transformer in transformer as backbone for deep reinforcement learning

H Mao, R Zhao, H Chen, J Hao, Y Chen, D Li… - arXiv preprint arXiv …, 2022 - arxiv.org
Designing better deep networks and better reinforcement learning (RL) algorithms are both
important for deep RL. This work focuses on the former. Previous methods build the network …

Q-value regularized transformer for offline reinforcement learning

S Hu, Z Fan, C Huang, L Shen, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action …

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

S Hu, Z Fan, L Shen, Y Zhang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy
applicable to diverse tasks without the need for online environmental interaction. Recent …