SVDQuant: Absorbing outliers by low-rank components for 4-bit diffusion models
Diffusion models have been proven highly effective at generating high-quality images.
However, as these models grow larger, they require significantly more memory and suffer …
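The title already names the core mechanism: absorb hard-to-quantize outliers into a low-rank branch kept in higher precision, and quantize only the residual to 4 bits. A minimal sketch of that decomposition follows; the rank and the symmetric per-tensor fake quantizer are illustrative assumptions, not the paper's actual configuration.

```python
# Sketch: split a weight matrix into a low-rank branch (kept in high precision)
# plus a 4-bit-quantized residual. Rank and quantizer are assumed for illustration.
import numpy as np

def fake_int4_quant(x):
    # Symmetric per-tensor "fake" 4-bit quantization (assumed scheme).
    scale = np.abs(x).max() / 7.0 + 1e-12
    return np.clip(np.round(x / scale), -8, 7) * scale

def lowrank_plus_int4(W, rank=32):
    # The top singular directions tend to carry the outlier energy, so keeping
    # them in high precision leaves an easier-to-quantize residual.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    L1 = U[:, :rank] * S[:rank]     # (out, rank), kept in 16/32-bit
    L2 = Vt[:rank, :]               # (rank, in),  kept in 16/32-bit
    residual_q = fake_int4_quant(W - L1 @ L2)
    return L1, L2, residual_q

W = np.random.randn(512, 512).astype(np.float32)
L1, L2, Rq = lowrank_plus_int4(W)
print("relative error:", np.linalg.norm(W - (L1 @ L2 + Rq)) / np.linalg.norm(W))
```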
Real-time video generation with pyramid attention broadcast
We present Pyramid Attention Broadcast (PAB), a real-time, high-quality, and training-free
approach for DiT-based video generation. Our method is founded on the observation that …
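The name suggests the mechanism: attention outputs change little between adjacent denoising steps, so a computed output can be broadcast (reused) for a few subsequent steps, with different reuse ranges per attention type. A conceptual sketch, where the specific broadcast ranges and the spatial/temporal/cross split are assumptions rather than the paper's schedule:

```python
# Sketch: reuse ("broadcast") a cached attention output for a fixed number of
# denoising steps before recomputing. Ranges below are illustrative assumptions.
from typing import Any, Callable, Dict

class BroadcastCache:
    def __init__(self, broadcast_range: int):
        self.broadcast_range = broadcast_range
        self.cached: Any = None
        self.steps_since_compute = 0

    def __call__(self, compute: Callable[[], Any]) -> Any:
        if self.cached is None or self.steps_since_compute >= self.broadcast_range:
            self.cached = compute()          # recompute attention at this step
            self.steps_since_compute = 0
        else:
            self.steps_since_compute += 1    # reuse (broadcast) the cached output
        return self.cached

# Assumed ordering: cross-attention changes most slowly, so it gets the widest range.
attention_caches: Dict[str, BroadcastCache] = {
    "spatial":  BroadcastCache(broadcast_range=2),
    "temporal": BroadcastCache(broadcast_range=4),
    "cross":    BroadcastCache(broadcast_range=6),
}
```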
Lazydit: Lazy learning for the acceleration of diffusion transformers
Diffusion Transformers have emerged as the preeminent models for a wide array of
generative tasks, demonstrating superior performance and efficacy across various …
Accelerating diffusion transformers with token-wise feature caching
Diffusion transformers have shown significant effectiveness in both image and video
synthesis at the expense of huge computation costs. To address this problem, feature …
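Feature caching here is token-wise: rather than recomputing a layer for every token at every step, only tokens whose inputs have drifted are refreshed while the rest reuse cached outputs. A sketch under assumed details (top-k selection by input drift, a token-wise layer such as a linear projection):

```python
# Sketch: recompute a layer only for the tokens whose inputs changed most since
# the last computation; reuse cached outputs for the rest. The refresh ratio and
# drift-based selection are assumptions for illustration.
import torch

@torch.no_grad()
def tokenwise_cached_layer(layer, x, cache, refresh_ratio=0.3):
    # x: (tokens, dim); cache holds the inputs/outputs from the last computation.
    if cache.get("out") is None:
        cache["in"], cache["out"] = x.clone(), layer(x)
        return cache["out"]
    drift = (x - cache["in"]).norm(dim=-1)        # per-token input change
    k = max(1, int(refresh_ratio * x.shape[0]))
    idx = drift.topk(k).indices                   # tokens to recompute
    out = cache["out"].clone()
    out[idx] = layer(x[idx])
    cache["in"][idx] = x[idx]
    cache["out"] = out
    return out

layer = torch.nn.Linear(64, 64)
cache, x = {}, torch.randn(128, 64)
y0 = tokenwise_cached_layer(layer, x, cache)
y1 = tokenwise_cached_layer(layer, x + 0.01 * torch.randn_like(x), cache)
```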
Unveiling Redundancy in Diffusion Transformers (DiTs): A Systematic Study
X Sun, J Fang, A Li, J Pan. arXiv preprint arXiv:2411.13588, 2024.
The increased model capacity of Diffusion Transformers (DiTs) and the demand for
generating higher resolutions of images and videos have led to a significant rise in inference …
Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers
Diffusion Transformers (DiTs) have achieved state-of-the-art (SOTA) image generation
quality but suffer from high latency and memory inefficiency, making them difficult to deploy …
CAT Pruning: Cluster-Aware Token Pruning For Text-to-Image Diffusion Models
Diffusion models have revolutionized generative tasks, especially in the domain of text-to-
image synthesis; however, their iterative denoising process demands substantial …
Effortless Efficiency: Low-Cost Pruning of Diffusion Models
Diffusion models have achieved impressive advancements in various vision tasks. However,
these gains often rely on increasing model size, which escalates computational complexity …
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
As a fundamental backbone for video generation, diffusion models are challenged by low
inference speed due to the sequential nature of denoising. Previous methods speed up the …
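The title hints at the caching trigger: watch how much the timestep-embedding-modulated input changes between steps, and only recompute the expensive forward pass once the accumulated change crosses a threshold; otherwise reuse the cached output. A sketch where the relative-change metric and threshold are assumptions:

```python
# Sketch: accumulate a relative-change estimate of the (timestep-embedding-
# modulated) input across denoising steps; recompute only when it exceeds a
# threshold, otherwise return the cached output. Metric and threshold are assumed.
import torch

class TimestepAwareCache:
    def __init__(self, threshold: float = 0.1):
        self.threshold = threshold
        self.prev_input = None
        self.accumulated = 0.0
        self.cached_out = None

    @torch.no_grad()
    def __call__(self, modulated_input, compute):
        if self.prev_input is not None:
            rel = ((modulated_input - self.prev_input).abs().mean()
                   / (self.prev_input.abs().mean() + 1e-8)).item()
            self.accumulated += rel
        self.prev_input = modulated_input.clone()
        if self.cached_out is None or self.accumulated >= self.threshold:
            self.cached_out = compute(modulated_input)   # full forward pass
            self.accumulated = 0.0
        return self.cached_out

cache = TimestepAwareCache(threshold=0.1)
model = torch.nn.Linear(64, 64)
for t in range(10):
    x = torch.randn(8, 64) * 0.01 + 1.0   # slowly varying inputs mostly hit the cache
    y = cache(x, model)
```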
xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Diffusion models are pivotal for generating high-quality images and videos. Inspired by the
success of OpenAI's Sora, the backbone of diffusion models is evolving from U-Net to …