The power of generative AI: A review of requirements, models, input–output formats, evaluation metrics, and challenges

A Bandi, PVSR Adapa, YEVPK Kuchi - Future Internet, 2023 - mdpi.com
Generative artificial intelligence (AI) has emerged as a powerful technology with numerous
applications in various domains. There is a need to identify the requirements and evaluation …

A comprehensive overview and comparative analysis on deep learning models: CNN, RNN, LSTM, GRU

FM Shiri, T Perumal, N Mustapha… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep learning (DL) has emerged as a powerful subset of machine learning (ML) and
artificial intelligence (AI), outperforming traditional ML methods, especially in handling …

MasaCtrl: Tuning-free mutual self-attention control for consistent image synthesis and editing

M Cao, X Wang, Z Qi, Y Shan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Despite the success in large-scale text-to-image generation and text-conditioned image
editing, existing methods still struggle to produce consistent generation and editing results …

VQGAN-CLIP: Open domain image generation and editing with natural language guidance

K Crowson, S Biderman, D Kornis, D Stander… - European conference on …, 2022 - Springer
Generating and editing images from open domain text prompts is a challenging task that
heretofore has required expensive and specially trained models. We demonstrate a novel …

Text2LIVE: Text-driven layered image and video editing

O Bar-Tal, D Ofri-Amar, R Fridman, Y Kasten… - European conference on …, 2022 - Springer
We present a method for zero-shot, text-driven editing of natural images and videos. Given
an image or a video and a text prompt, our goal is to edit the appearance of existing objects …

StyleCLIP: Text-driven manipulation of StyleGAN imagery

O Patashnik, Z Wu, E Shechtman… - Proceedings of the …, 2021 - openaccess.thecvf.com
Inspired by the ability of StyleGAN to generate highly realistic images in a variety of
domains, much recent work has focused on understanding how to use the latent spaces …

Guiding instruction-based image editing via multimodal large language models

TJ Fu, W Hu, X Du, WY Wang, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Instruction-based image editing improves the controllability and flexibility of image
manipulation via natural commands without elaborate descriptions or regional masks …

TediGAN: Text-guided diverse face image generation and manipulation

W Xia, Y Yang, JH Xue, B Wu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
In this work, we propose TediGAN, a novel framework for multi-modal image generation and
manipulation with textual descriptions. The proposed method consists of three components …

More control for free! Image synthesis with semantic diffusion guidance

X Liu, DH Park, S Azadi, G Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Controllable image synthesis models allow creation of diverse images based on text
instructions or guidance from a reference image. Recently, denoising diffusion probabilistic …

A survey on generative adversarial networks: Variants, applications, and training

A Jabbar, X Li, B Omar - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
The Generative Models have gained considerable attention in unsupervised learning via a
new and practical framework called Generative Adversarial Networks (GAN) due to their …