A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

A survey on deep learning applied to medical images: from simple artificial neural networks to generative models

P Celard, EL Iglesias, JM Sorribes-Fdez… - Neural Computing and …, 2023 - Springer
Deep learning techniques, in particular generative models, have taken on great importance
in medical image analysis. This paper surveys fundamental deep learning concepts related …

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com
Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Stable video diffusion: Scaling latent video diffusion models to large datasets

A Blattmann, T Dockhorn, S Kulal… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Stable Video Diffusion, a latent video diffusion model for high-resolution,
state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained …

Structure and content-guided video synthesis with diffusion models

P Esser, J Chiu, P Atighehchian… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-guided generative diffusion models unlock powerful image creation and editing tools.
Recent approaches that edit the content of footage while retaining structure require …

VideoCrafter2: Overcoming data limitations for high-quality video diffusion models

H Chen, Y Zhang, X Cun, M Xia… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-video generation aims to produce a video based on a given prompt. Recently
several commercial video models have been able to generate plausible videos with minimal …

Preserve your own correlation: A noise prior for video diffusion models

S Ge, S Nah, G Liu, T Poon, A Tao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Despite tremendous progress in generating high-quality images using diffusion models,
synthesizing a sequence of animated frames that are both photorealistic and temporally …

LaVie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - International Journal of …, 2024 - Springer
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

Pix2Video: Video editing using image diffusion

D Ceylan, CHP Huang, NJ Mitra - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Image diffusion models, trained on massive image collections, have emerged as the most
versatile image generator model in terms of quality and diversity. They support inverting real …

Latte: Latent diffusion transformer for video generation

X Ma, Y Wang, G Jia, X Chen, Z Liu, YF Li… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose a novel Latent Diffusion Transformer, namely Latte, for video generation. Latte
first extracts spatio-temporal tokens from input videos and then adopts a series of …