A brief overview of ChatGPT: The history, status quo and potential future development

T Wu, S He, J Liu, S Sun, K Liu… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org
ChatGPT, an artificial intelligence generated content (AIGC) model developed by OpenAI,
has attracted world-wide attention for its capability of dealing with challenging language …

Diffusion models: A comprehensive survey of methods and applications

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org
Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Animatediff: Animate your personalized text-to-image diffusion models without specific tuning

Y Guo, C Yang, A Rao, Z Liang, Y Wang, Y Qiao… - arxiv preprint arxiv …, 2023 - arxiv.org
With the advance of text-to-image (T2I) diffusion models (eg, Stable Diffusion) and
corresponding personalization techniques such as DreamBooth and LoRA, everyone can …

Text2video-zero: Text-to-image diffusion models are zero-shot video generators

L Khachatryan, A Movsisyan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-video generation approaches rely on computationally heavy training and
require large-scale video datasets. In this paper, we introduce a new task, zero-shot text-to …

Sdxl: Improving latent diffusion models for high-resolution image synthesis

D Podell, Z English, K Lacey, A Blattmann… - arxiv preprint arxiv …, 2023 - arxiv.org
We present SDXL, a latent diffusion model for text-to-image synthesis. Compared to
previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone …

Structure and content-guided video synthesis with diffusion models

P Esser, J Chiu, P Atighehchian… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-guided generative diffusion models unlock powerful image creation and editing tools.
Recent approaches that edit the content of footage while retaining structure require …

Tune-a-video: One-shot tuning of image diffusion models for text-to-video generation

JZ Wu, Y Ge, X Wang, SW Lei, Y Gu… - Proceedings of the …, 2023 - openaccess.thecvf.com
To replicate the success of text-to-image (T2I) generation, recent works employ large-scale
video datasets to train a text-to-video (T2V) generator. Despite their promising results, such …

Fatezero: Fusing attentions for zero-shot text-based video editing

C Qi, X Cun, Y Zhang, C Lei, X Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
The diffusion-based generative models have achieved remarkable success in text-based
image generation. However, since it contains enormous randomness in generation …

Lavie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - International Journal of …, 2024 - Springer
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …