A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

MotionDirector: Motion customization of text-to-video diffusion models

R Zhao, Y Gu, JZ Wu, DJ Zhang, JW Liu, W Wu… - … on Computer Vision, 2024 - Springer
Large-scale pre-trained diffusion models have exhibited remarkable capabilities in diverse
video generations. Given a set of video clips of the same motion concept, the task of Motion …

PhysGen: Rigid-body physics-grounded image-to-video generation

S Liu, Z Ren, S Gupta, S Wang - European Conference on Computer …, 2024 - Springer
We present PhysGen, a novel image-to-video generation method that converts a single
image and an input condition (e.g., force and torque applied to an object in the image) to …

Direct-a-video: Customized video generation with user-directed camera movement and object motion

S Yang, L Hou, H Huang, C Ma, P Wan… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
Recent text-to-video diffusion models have achieved impressive progress. In practice, users
often desire the ability to control object motion and camera movement independently for …

Customize-a-video: One-shot motion customization of text-to-video diffusion models

Y Ren, Y Zhou, J Yang, J Shi, D Liu, F Liu… - … on Computer Vision, 2024 - Springer
Image customization has been extensively studied in text-to-image (T2I) diffusion models,
leading to impressive outcomes and applications. With the emergence of text-to-video (T2V) …

A recipe for scaling up text-to-video generation with text-free videos

X Wang, S Zhang, H Yuan, Z Qing… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion-based text-to-video generation has witnessed impressive progress in the past year
yet still falls behind text-to-image generation. One of the key reasons is the limited scale of …

Make-Your-3D: Fast and consistent subject-driven 3D content generation

F Liu, H Wang, W Chen, H Sun, Y Duan - European Conference on …, 2024 - Springer
Recent years have witnessed the strong power of 3D generation models, which offer a new
level of creative flexibility by allowing users to guide the 3D content generation process …

MotionBooth: Motion-aware customized text-to-video generation

J Wu, X Li, Y Zeng, J Zhang, Q Zhou, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present MotionBooth, an innovative framework designed for animating
customized subjects with precise control over both object and camera movements. By …

InstructVideo: instructing video diffusion models with human feedback

H Yuan, S Zhang, X Wang, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models have emerged as the de facto paradigm for video generation. However,
their reliance on web-scale data of varied quality often yields results that are visually …

VideoLCM: Video latent consistency model

X Wang, S Zhang, H Zhang, Y Liu, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Consistency models have demonstrated powerful capability in efficient image generation
and allowed synthesis within a few sampling steps, alleviating the high computational cost in …