FastComposer: Tuning-free multi-subject image generation with localized attention
Diffusion models excel at text-to-image generation, especially in subject-driven generation
for personalized images. However, existing methods are inefficient due to the subject …
Representation alignment for generation: Training diffusion transformers is easier than you think
Recent studies have shown that the denoising process in (generative) diffusion models can
induce meaningful (discriminative) representations inside the model, though the quality of …
LVCD: Reference-based lineart video colorization with diffusion models
Z Huang, M Zhang, J Liao - ACM Transactions on Graphics (TOG), 2024 - dl.acm.org
We propose the first video diffusion framework for reference-based lineart video colorization.
Unlike previous works that rely solely on image generative models to colorize lineart frame …
SVDQuant: Absorbing outliers by low-rank components for 4-bit diffusion models
Diffusion models have been proven highly effective at generating high-quality images.
However, as these models grow larger, they require significantly more memory and suffer …
Fast and Memory-Efficient Video Diffusion Using Streamlined Inference
The rapid progress in artificial intelligence-generated content (AIGC), especially with
diffusion models, has significantly advanced development of high-quality video generation …
Diffusion Adversarial Post-Training for One-Step Video Generation
Diffusion models are widely used for image and video generation, but their iterative
generation process is slow and expensive. While existing distillation approaches have …
Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers
Diffusion Transformers (DiTs) have achieved state-of-the-art (SOTA) image generation
quality but suffer from high latency and memory inefficiency, making them difficult to deploy …
From slow bidirectional to fast causal video generators
Current video diffusion models achieve impressive generation quality but struggle in
interactive applications due to bidirectional attention dependencies. The generation of a …
Adversarial diffusion compression for real-world image super-resolution
Real-world image super-resolution (Real-ISR) aims to reconstruct high-resolution images
from low-resolution inputs degraded by complex, unknown processes. While many Stable …
InstantDrag: Improving Interactivity in Drag-based Image Editing
Drag-based image editing has recently gained popularity for its interactivity and precision.
However, despite the ability of text-to-image models to generate samples within a second …