A survey on video diffusion models
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …
MiraData: A large-scale video dataset with long durations and structured captions
Sora's high-motion intensity and long consistent videos have significantly impacted the field
of video generation, attracting unprecedented attention. However, existing publicly available …
DreamVideo: Composing your dream videos with customized subject and motion
Customized generation using diffusion models has made impressive progress in image
generation but remains unsatisfactory in the challenging video generation task as it requires …
A recipe for scaling up text-to-video generation with text-free videos
Diffusion-based text-to-video generation has witnessed impressive progress in the past year
yet still falls behind text-to-image generation. One of the key reasons is the limited scale of …
InstructVideo: Instructing video diffusion models with human feedback
Diffusion models have emerged as the de facto paradigm for video generation. However,
their reliance on web-scale data of varied quality often yields results that are visually …
DreamTalk: When expressive talking head generation meets diffusion probabilistic models
Diffusion models have shown remarkable success in a variety of downstream generative
tasks, yet remain under-explored in the important and challenging expressive talking head …
T2V-Turbo-v2: Enhancing video generation model post-training through data, reward, and conditional guidance design
In this paper, we focus on enhancing a diffusion-based text-to-video (T2V) model during the
post-training phase by distilling a highly capable consistency model from a pretrained T2V …
OSV: One step is enough for high-quality image to video generation
Video diffusion models have shown great potential in generating high-quality videos,
making them an increasingly popular focus. However, their inherent iterative nature leads to …
Towards a mathematical theory for consistency training in diffusion models
Consistency models, which were proposed to mitigate the high computational overhead
during the sampling phase of diffusion models, facilitate single-step sampling while attaining …
AudioLCM: Efficient and high-quality text-to-audio generation with minimal inference steps
Recent advancements in Latent Diffusion Models (LDMs) have propelled them to the
forefront of various generative tasks. However, their iterative sampling process poses a …