A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

T2I-Adapter: Learning adapters to dig out more controllable ability for text-to-image diffusion models

C Mou, X Wang, L Xie, Y Wu, J Zhang, Z Qi… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
The incredible generative ability of large-scale text-to-image (T2I) models has demonstrated
strong power of learning complex structures and meaningful semantics. However, relying …

DreamBooth3D: Subject-driven text-to-3D generation

A Raj, S Kaza, B Poole, M Niemeyer… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present DreamBooth3D, an approach to personalize text-to-3D generative models from
as few as 3-6 casually captured images of a subject. Our approach combines recent …

Text-to-image diffusion models in generative AI: A survey

C Zhang, C Zhang, M Zhang, IS Kweon - arXiv preprint arXiv:2303.07909, 2023 - arxiv.org
This survey reviews text-to-image diffusion models in the context that diffusion models have
emerged to be popular for a wide range of generative tasks. As a self-contained work, this …

Pix2Video: Video editing using image diffusion

D Ceylan, CHP Huang, NJ Mitra - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Image diffusion models, trained on massive image collections, have emerged as the most
versatile image generator model in terms of quality and diversity. They support inverting real …

Dense text-to-image generation with attention modulation

Y Kim, J Lee, JH Kim, JW Ha… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Existing text-to-image diffusion models struggle to synthesize realistic images given dense
captions, where each text prompt provides a detailed description for a specific image region …

Composer: Creative and controllable image synthesis with composable conditions

L Huang, D Chen, Y Liu, Y Shen, D Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent large-scale generative models learned on big data are capable of synthesizing
incredible images yet suffer from limited controllability. This work offers a new generation …

Dreamix: Video diffusion models are general video editors

E Molad, E Horwitz, D Valevski, AR Acha… - arXiv preprint arXiv …, 2023 - arxiv.org
Text-driven image and video diffusion models have recently achieved unprecedented
generation realism. While diffusion models have been successfully applied for image …

SparseCtrl: Adding sparse controls to text-to-video diffusion models

Y Guo, C Yang, A Rao, M Agrawala, D Lin… - European Conference on …, 2024 - Springer
The development of text-to-video (T2V), i.e., generating videos with a given text prompt, has
been significantly advanced in recent years. However, relying solely on text prompts often …