A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

LaVie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - International Journal of …, 2024 - Springer
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

VBench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

GauHuman: Articulated gaussian splatting from monocular human videos

S Hu, T Hu, Z Liu - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
We present GauHuman, a 3D human model with Gaussian Splatting for both fast training (1-2
minutes) and real-time rendering (up to 189 FPS) compared with existing NeRF-based …

FreeInit: Bridging initialization gap in video diffusion models

T Wu, C Si, Y Jiang, Z Huang, Z Liu - European Conference on Computer …, 2024 - Springer
Though diffusion-based video generation has witnessed rapid progress, the inference
results of existing models still exhibit unsatisfactory temporal consistency and unnatural …

VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation witnesses rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

ID-Animator: Zero-shot identity-preserving human video generation

X He, Q Liu, S Qian, X Wang, T Hu, K Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Generating high-fidelity human video with specified identities has attracted significant
attention in the content generation community. However, existing techniques struggle to …

DisCo: Disentangled control for realistic human dance generation

T Wang, L Li, K Lin, Y Zhai, CC Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generative AI has made significant strides in computer vision, particularly in text-driven
image/video synthesis (T2I/T2V). Despite the notable advancements, it remains challenging …

Appearance and Pose-guided Human Generation: A Survey

F Liao, X Zou, W Wong - ACM Computing Surveys, 2024 - dl.acm.org
Appearance and pose-guided human generation is a burgeoning field that has captured
significant attention. This subject's primary objective is to transfer pose information from a …

Compositional abilities emerge multiplicatively: Exploring diffusion models on a synthetic task

M Okawa, ES Lubana, R Dick… - Advances in Neural …, 2024 - proceedings.neurips.cc
Modern generative models exhibit unprecedented capabilities to generate extremely
realistic data. However, given the inherent compositionality of the real world, reliable use of …