PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

J Chen, C Ge, E Xie, Y Wu, L Yao, X Ren… - … on Computer Vision, 2024 - Springer
In this paper, we introduce PixArt-Σ, a Diffusion Transformer model (DiT) capable of directly
generating images at 4K resolution. PixArt-Σ represents a significant advancement over its …

Fast high-resolution image synthesis with latent adversarial diffusion distillation

A Sauer, F Boesel, T Dockhorn, A Blattmann… - SIGGRAPH Asia 2024 …, 2024 - dl.acm.org
Diffusion models are the main driver of progress in image and video synthesis, but suffer
from slow inference speed. Distillation methods, like the recently introduced adversarial …

MobileDiffusion: Instant text-to-image generation on mobile devices

Y Zhao, Y Xu, Z Xiao, H Jia, T Hou - European Conference on Computer …, 2024 - Springer
The deployment of large-scale text-to-image diffusion models on mobile devices is impeded
by their substantial model size and high latency. In this paper, we present MobileDiffusion …

AlphaFold meets flow matching for generating protein ensembles

B Jing, B Berger, T Jaakkola - arXiv preprint arXiv:2402.04845, 2024 - arxiv.org
The biological functions of proteins often depend on dynamic structural ensembles. In this
work, we develop a flow-based generative modeling approach for learning and sampling the …
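
The snippet cuts off before any method detail, so as a generic illustration of flow matching (not this paper's AlphaFold-conditioned ensemble model), the sketch below shows a minimal conditional flow-matching training step; `v_theta`, the linear noise-to-data path, and all shapes are assumptions made for the example.

```python
# Minimal sketch of one conditional flow-matching training step (generic,
# not the AlphaFold-specific pipeline from the paper). `v_theta(x_t, t)` is
# assumed to be a PyTorch module predicting a velocity field.
import torch

def flow_matching_loss(v_theta, x1):
    """x1: a batch of data samples, shape (B, D)."""
    x0 = torch.randn_like(x1)                          # noise endpoint of the path
    t = torch.rand(x1.shape[0], 1, device=x1.device)   # uniform times in [0, 1]
    x_t = (1 - t) * x0 + t * x1                        # point on the straight path
    target_v = x1 - x0                                 # constant velocity of that path
    pred_v = v_theta(x_t, t)                           # model's velocity prediction
    return ((pred_v - target_v) ** 2).mean()           # regress onto the target field
```

Sampling then amounts to integrating dx/dt = v_theta(x, t) from t = 0 to t = 1 with an ODE solver, starting from Gaussian noise.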

One-step effective diffusion network for real-world image super-resolution

R Wu, L Sun, Z Ma, L Zhang - Advances in Neural …, 2025 - proceedings.neurips.cc
Pre-trained text-to-image diffusion models have been increasingly employed to tackle
the real-world image super-resolution (Real-ISR) problem due to their powerful generative …

Advances in diffusion models for image data augmentation: A review of methods, models, evaluation metrics and future research directions

P Alimisis, I Mademlis, P Radoglou-Grammatikis… - Artificial Intelligence …, 2025 - Springer
Image data augmentation constitutes a critical methodology in modern computer vision
tasks, since it can enhance the diversity and quality of training datasets; …

One-step image translation with text-to-image models

G Parmar, T Park, S Narasimhan, JY Zhu - arXiv preprint arXiv:2403.12036, 2024 - arxiv.org
In this work, we address two limitations of existing conditional diffusion models: their slow
inference speed due to the iterative denoising process and their reliance on paired data for …

Improved distribution matching distillation for fast image synthesis

T Yin, M Gharbi, T Park, R Zhang, E Shechtman… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent approaches have shown promise in distilling diffusion models into efficient one-step
generators. Among them, Distribution Matching Distillation (DMD) produces one-step …
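
The snippet names Distribution Matching Distillation without describing it; the sketch below is only a loose reading of the general distribution-matching recipe (noise the one-step sample, compare a frozen teacher score against an auxiliary "fake" score fit to generator outputs), not the authors' exact losses or training loop. `generator`, `score_real`, and `score_fake` are placeholders.

```python
# Loose illustration of the distribution-matching idea behind DMD-style
# distillation; not the paper's exact objective, weighting, or fake-score
# training procedure. All callables here are assumed placeholders.
import torch

def dmd_generator_step(generator, score_real, score_fake, z, t, sigma_t):
    x = generator(z)                              # one-step sample from the student
    x_t = x + sigma_t * torch.randn_like(x)       # diffuse it to noise level sigma_t
    with torch.no_grad():
        # Approximate distribution-matching direction: difference between the
        # "fake" score (fit to generator samples) and the teacher's score.
        grad = score_fake(x_t, t) - score_real(x_t, t)
    # Surrogate loss whose gradient w.r.t. x equals `grad`; descending it
    # nudges generator outputs toward regions the teacher considers likely.
    return (grad * x).sum()
```

In DMD-style training the fake score model is itself periodically re-fit to fresh generator samples, alternating with these generator updates.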

Distilling diffusion models into conditional GANs

M Kang, R Zhang, C Barnes, S Paris, S Kwak… - … on Computer Vision, 2024 - Springer
We propose a method to distill a complex multistep diffusion model into a single-step
conditional GAN student model, dramatically accelerating inference, while preserving image …
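
As a rough sketch of the single-step conditional-GAN distillation idea (the snippet gives no details, and the paper's actual losses, such as its perceptual regression term, differ), the student below regresses onto teacher outputs for paired noise-and-condition inputs while also fooling a conditional discriminator; all names are illustrative placeholders.

```python
# Rough sketch of one-step conditional-GAN distillation: paired regression
# onto teacher outputs plus an adversarial term. Not the paper's exact
# formulation; names and losses are placeholders for illustration.
import torch.nn.functional as F

def student_step(student, discriminator, z, cond, x_teacher, adv_weight=0.1):
    x_fake = student(z, cond)                  # single forward pass of the student
    recon = F.mse_loss(x_fake, x_teacher)      # match the teacher's multistep output
    adv = -discriminator(x_fake, cond).mean()  # encourage realistic, condition-consistent images
    return recon + adv_weight * adv            # the discriminator is trained separately
```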

Multistep distillation of diffusion models via moment matching

T Salimans, T Mensink, J Heek… - Advances in Neural …, 2025 - proceedings.neurips.cc
We present a new method for making diffusion models faster to sample. The method distills
many-step diffusion models into few-step models by matching conditional expectations of …
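
The snippet stops mid-sentence; as a very loose reading of "matching conditional expectations" (not the paper's actual multi-step algorithm), the sketch below noises both real data and student samples to the same level, denoises both with the frozen teacher, and penalizes the gap between the resulting conditional means. Every name here is a placeholder.

```python
# Schematic reading of moment matching for distillation: compare the teacher's
# denoised conditional means on noised real data vs. noised student samples.
# Not the paper's actual algorithm; all names are placeholders.
import torch

def moment_matching_loss(teacher_denoiser, student, z, x_real, t, sigma_t):
    x_student = student(z)                                      # few-step / one-step sample
    xt_real = x_real + sigma_t * torch.randn_like(x_real)       # noise real data to level t
    xt_student = x_student + sigma_t * torch.randn_like(x_student)
    with torch.no_grad():
        mean_real = teacher_denoiser(xt_real, t).mean(dim=0)    # estimate of E[x_0 | x_t] on data
    mean_student = teacher_denoiser(xt_student, t).mean(dim=0)  # same expectation on student samples
    return ((mean_student - mean_real) ** 2).sum()
```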