On Transforming Reinforcement Learning With Transformers: The Development Trajectory

S Hu, L Shen, Y Zhang, Y Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Transformers, originally devised for natural language processing (NLP), have also produced
significant successes in computer vision (CV). Due to their strong expression power …

Critic-guided decision transformer for offline reinforcement learning

Y Wang, C Yang, Y Wen, Y Liu, Y Qiao - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the …

Deep generative models for offline policy learning: Tutorial, survey, and perspectives on future directions

J Chen, B Ganguly, Y Xu, Y Mei, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep generative models (DGMs) have demonstrated great success across various domains,
particularly in generating texts, images, and videos using models trained from offline data …

Crossway diffusion: Improving diffusion-based visuomotor policy via self-supervised learning

X Li, V Belagali, J Shang… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Diffusion models have been adopted for behavioral cloning in a sequence modeling
fashion, benefiting from their exceptional capabilities in modeling complex data distributions …

Learning multi-agent communication from graph modeling perspective

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2405.08550, 2024 - arxiv.org
In numerous artificial intelligence applications, the collaborative efforts of multiple intelligent
agents are imperative for the successful attainment of target objectives. To enhance …

Prompt-tuning decision transformer with preference ranking

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2305.09648, 2023 - arxiv.org
Prompt-tuning has emerged as a promising method for adapting pre-trained models to
downstream tasks or aligning with human preferences. Prompt learning is widely used in …

PDiT: Interleaving perception and decision-making transformers for deep reinforcement learning

H Mao, R Zhao, Z Li, Z Xu, H Chen, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Designing better deep networks and better reinforcement learning (RL) algorithms are both
important for deep RL. This work studies the former. Specifically, the Perception and …

Q-value regularized transformer for offline reinforcement learning

S Hu, Z Fan, C Huang, L Shen, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action …

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

S Hu, Z Fan, L Shen, Y Zhang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy
applicable to diverse tasks without the need for online environmental interaction. Recent …

Gradformer: Graph Transformer with Exponential Decay

C Liu, Z Yao, Y Zhan, X Ma, S Pan, W Hu - arXiv preprint arXiv:2404.15729, 2024 - arxiv.org
Graph Transformers (GTs) have demonstrated their advantages across a wide range of
tasks. However, the self-attention mechanism in GTs overlooks the graph's inductive biases …