Enhancing deep reinforcement learning: A tutorial on generative diffusion models in network optimization

H Du, R Zhang, Y Liu, J Wang, Y Lin… - … Surveys & Tutorials, 2024 - ieeexplore.ieee.org
Generative Diffusion Models (GDMs) have emerged as a transformative force in the realm of
Generative Artificial Intelligence (GenAI), demonstrating their versatility and efficacy across …

Opportunities and challenges of diffusion models for generative AI

M Chen, S Mei, J Fan, M Wang - National Science Review, 2024 - academic.oup.com
Diffusion models, a powerful and universal generative artificial intelligence technology, have
achieved tremendous success and opened up new possibilities in diverse applications. In …

TFG: Unified training-free guidance for diffusion models

H Ye, H Lin, J Han, M Xu, S Liu… - Advances in …, 2025 - proceedings.neurips.cc
Given an unconditional diffusion model and a predictor for a target property of interest (e.g., a
classifier), the goal of training-free guidance is to generate samples with desirable target …

Diffusion-DICE: In-sample diffusion guidance for offline reinforcement learning

L Mao, H Xu, X Zhan, W Zhang… - Advances in Neural …, 2025 - proceedings.neurips.cc
One important property of DIstribution Correction Estimation (DICE) methods is that the
solution is the optimal stationary distribution ratio between the optimized and data collection …

Amortizing intractable inference in diffusion models for vision, language, and control

S Venkatraman, M Jain, L Scimeca, M Kim… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have emerged as effective distribution estimators in vision, language, and
reinforcement learning, but their use as priors in downstream tasks poses an intractable …

Consistency models as a rich and efficient policy class for reinforcement learning

Z Ding, C Jin - arXiv preprint arXiv:2309.16984, 2023 - arxiv.org
Score-based generative models like the diffusion model have been shown to be effective in
modeling multi-modal data from image generation to reinforcement learning (RL). However …

Noise contrastive alignment of language models with explicit rewards

H Chen, G He, L Yuan, G Cui, H Su, J Zhu - arXiv preprint arXiv …, 2024 - arxiv.org
User intentions are typically formalized as evaluation rewards to be maximized when fine-
tuning language models (LMs). Existing alignment methods, such as Direct Preference …

Safe offline reinforcement learning with feasibility-guided diffusion model

Y Zheng, J Li, D Yu, Y Yang, SE Li, X Zhan… - arXiv preprint arXiv …, 2024 - arxiv.org
Safe offline RL is a promising way to bypass risky online interactions towards safe policy
learning. Most existing methods only enforce soft constraints, i.e., constraining safety …

Diffusion-ES: Gradient-free planning with diffusion for autonomous and instruction-guided driving

B Yang, H Su, N Gkanatsios, TW Ke… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models excel at modeling complex and multimodal trajectory distributions for
decision-making and control. Reward-gradient guided denoising has been recently …

Simple hierarchical planning with diffusion

C Chen, F Deng, K Kawaguchi, C Gulcehre… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion-based generative methods have proven effective in modeling trajectories with
offline datasets. However, they often face computational challenges and can falter in …