From slow bidirectional to fast causal video generators

T Yin, Q Zhang, R Zhang, WT Freeman… - arXiv preprint arXiv …, 2024 - arxiv.org
Current video diffusion models achieve impressive generation quality but struggle in
interactive applications due to bidirectional attention dependencies. The generation of a …

Stable Consistency Tuning: Understanding and Improving Consistency Models

FY Wang, Z Geng, H Li - arXiv preprint arXiv:2410.18958, 2024 - arxiv.org
Diffusion models achieve superior generation quality but suffer from slow generation speed
due to the iterative nature of denoising. In contrast, consistency models, a new generative …

Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models

Y Wu, H Wang, Z Chen, D Xu - arXiv preprint arXiv:2411.18375, 2024 - arxiv.org
The high computational cost and slow inference time of video diffusion models (VDMs) are major obstacles to deploying them in practical applications. To overcome this, we introduce a new …

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Y Wu, Z Zhang, Y Li, Y Xu, A Kag, Y Sui… - arXiv preprint arXiv …, 2024 - arxiv.org
We have witnessed the unprecedented success of diffusion-based video generation over
the past year. Recently proposed models from the community have wielded the power to …

Real-time One-Step Diffusion-based Expressive Portrait Videos Generation

H Guo, H Yi, D Zhou, AW Bergman… - arXiv preprint arXiv …, 2024 - arxiv.org
Latent diffusion models have made great strides in generating expressive portrait videos
with accurate lip-sync and natural motion from a single reference image and audio input …