Deep generative models for offline policy learning: Tutorial, survey, and perspectives on future directions

J Chen, B Ganguly, Y Xu, Y Mei, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep generative models (DGMs) have demonstrated great success across various domains,
particularly in generating texts, images, and videos using models trained from offline data …

Decision mamba: A multi-grained state space model with self-evolution regularization for offline rl

Q Lv, X Deng, G Chen, MY Wang… - Advances in Neural …, 2025 - proceedings.neurips.cc
While the conditional sequence modeling with the transformer architecture has
demonstrated its effectiveness in dealing with offline reinforcement learning (RL) tasks, it is …

Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive

Z Wang, H Wang, Y Qi - arXiv preprint arXiv:2412.00979, 2024 - arxiv.org
Decision transformers recast reinforcement learning as a conditional sequence generation
problem, offering a simple but effective alternative to traditional value or policy-based …

Advances in Transformers for Robotic Applications: A Review

N Sanghai, NB Brown - arXiv preprint arXiv:2412.10599, 2024 - arxiv.org
The introduction of Transformers architecture has brought about significant breakthroughs in
Deep Learning (DL), particularly within Natural Language Processing (NLP). Since their …

PrefMMT: Modeling Human Preferences in Preference-based Reinforcement Learning with Multimodal Transformers

D Zhao, R Wang, D Suh, T Kim, Z Yuan, BC Min… - arXiv preprint arXiv …, 2024 - arxiv.org
Preference-based reinforcement learning (PbRL) shows promise in aligning robot behaviors
with human preferences, but its success depends heavily on the accurate modeling of …

Evaluating Durability: Benchmark Insights into Image and Text Watermarking

J Qiu, W Han, X Zhao, S Long… - Journal of Data-centric …, 2024 - openreview.net
As large models become increasingly prevalent, watermarking has emerged as a crucial
technology for copyright protection, authenticity verification, and content tracking. The rise of …

Provable Algorithms for Reinforcement Learning: Efficiency, Scalability, and Robustness

L Shi - 2023 - search.proquest.com
Reinforcement learning (RL), which strives to learn desirable sequential decisions based on
trial-and-error interactions with an unknown environment, has achieved remarkable success …

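A Survey of Offline Reinforcement Learning Methods Based on Representation Learning

王雪松, 王荣荣, 程玉虎 - Acta Automatica Sinica (自动化学报), 2024 - aas.net.cn
Reinforcement learning (RL) learns an optimal policy through online interaction between an agent and its environment,
and in recent years has become an important means of solving perception and decision-making problems in complex environments. However, collecting data online may give rise to …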

A Generalization Perspective on Model-Based Offline Reinforcement Learning

P Nair - research.tue.nl
Offline Reinforcement Learning (RL) has become a cost-effective data-driven
approach in the realm of Deep Reinforcement Learning, addressing safety and …