Distilling diffusion models into conditional GANs

M Kang, R Zhang, C Barnes, S Paris, S Kwak… - … on Computer Vision, 2024 - Springer
We propose a method to distill a complex multistep diffusion model into a single-step
conditional GAN student model, dramatically accelerating inference, while preserving image …

Representation alignment for generation: Training diffusion transformers is easier than you think

S Yu, S Kwak, H Jang, J Jeong, J Huang, J Shin… - arxiv preprint arxiv …, 2024 - arxiv.org
Recent studies have shown that the denoising process in (generative) diffusion models can
induce meaningful (discriminative) representations inside the model, though the quality of …

Efficient diffusion models: A comprehensive survey from principles to practices

Z Ma, Y Zhang, G Jia, L Zhao, Y Ma, M Ma… - arxiv preprint arxiv …, 2024 - arxiv.org
As one of the most popular and sought-after generative models in the recent years, diffusion
models have sparked the interests of many researchers and steadily shown excellent …

Video diffusion alignment via reward gradients

M Prabhudesai, R Mendonca, Z Qin… - arxiv preprint arxiv …, 2024 - arxiv.org
We have made significant progress towards building foundational video diffusion models. As
these models are trained using large-scale unsupervised data, it has become crucial to …

Diffusion models and representation learning: A survey

M Fuest, P Ma, M Gui, JS Fischer, VT Hu… - arxiv preprint arxiv …, 2024 - arxiv.org
Diffusion Models are popular generative modeling methods in various vision tasks, attracting
significant attention. They can be considered a unique instance of self-supervised learning …

SLEDGE: Synthesizing driving environments with generative models and rule-based traffic

K Chitta, D Dauner, A Geiger - European Conference on Computer Vision, 2024 - Springer
SLEDGE is the first generative simulator for vehicle motion planning trained on real-world
driving logs. Its core component is a learned model that is able to generate agent bounding …

Alignment of diffusion models: Fundamentals, challenges, and future

B Liu, S Shao, B Li, L Bai, Z Xu, H Xiong, J Kwok… - arxiv preprint arxiv …, 2024 - arxiv.org
Diffusion models have emerged as the leading paradigm in generative modeling, excelling
in various applications. Despite their success, these models often misalign with human …

SpaceBlender: Creating context-rich collaborative spaces through generative 3D scene blending

N Numan, S Rajaram, BT Kumaravel… - Proceedings of the 37th …, 2024 - dl.acm.org
There is increased interest in using generative AI to create 3D spaces for Virtual Reality (VR)
applications. However, today's models produce artificial environments, falling short of …

OSV: One step is enough for high-quality image to video generation

X Mao, Z Jiang, FY Wang, W Zhu, J Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org
Video diffusion models have shown great potential in generating high-quality videos,
making them an increasingly popular focus. However, their inherent iterative nature leads to …

Draw an audio: Leveraging multi-instruction for video-to-audio synthesis

Q Yang, B Mao, Z Wang, X Nie, P Gao, Y Guo… - arxiv preprint arxiv …, 2024 - arxiv.org
Foley is a term commonly used in filmmaking, referring to the addition of daily sound effects
to silent films or videos to enhance the auditory experience. Video-to-Audio (V2A), as a …