A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

VBench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

LaVie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - International Journal of …, 2024 - Springer
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

GauHuman: Articulated Gaussian splatting from monocular human videos

S Hu, T Hu, Z Liu - … of the IEEE/CVF conference on …, 2024 - openaccess.thecvf.com
We present GauHuman, a 3D human model with Gaussian Splatting for both fast training (1~2
minutes) and real-time rendering (up to 189 FPS), compared with existing NeRF-based …

DisCo: Disentangled control for realistic human dance generation

T Wang, L Li, K Lin, Y Zhai, CC Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generative AI has made significant strides in computer vision, particularly in text-driven
image/video synthesis (T2I/T2V). Despite the notable advancements, it remains challenging …

VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation witnesses rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

From Sora what we can see: A survey of text-to-video generation

R Sun, Y Zhang, T Shah, J Sun, S Zhang, W Li… - arXiv preprint arXiv …, 2024 - arxiv.org
With impressive achievements made, artificial intelligence is on the path forward to artificial
general intelligence. Sora, developed by OpenAI, which is capable of minute-level world …

FreeInit: Bridging initialization gap in video diffusion models

T Wu, C Si, Y Jiang, Z Huang, Z Liu - European Conference on Computer …, 2024 - Springer
Though diffusion-based video generation has witnessed rapid progress, the inference
results of existing models still exhibit unsatisfactory temporal consistency and unnatural …

Compositional abilities emerge multiplicatively: Exploring diffusion models on a synthetic task

M Okawa, ES Lubana, R Dick… - Advances in Neural …, 2023 - proceedings.neurips.cc
Modern generative models exhibit unprecedented capabilities to generate extremely
realistic data. However, given the inherent compositionality of the real world, reliable use of …

Generative semantic communication: Diffusion models beyond bit recovery

E Grassucci, S Barbarossa, D Comminiello - arXiv preprint arXiv …, 2023 - arxiv.org
Semantic communication is expected to be one of the cores of next-generation AI-based
communications. One of the possibilities offered by semantic communication is the capability …