Deep learning modelling techniques: current progress, applications, advantages, and challenges

SF Ahmed, MSB Alam, M Hassan, MR Rozbu… - Artificial Intelligence …, 2023 - Springer
Deep learning (DL) is revolutionizing evidence-based decision-making techniques that can
be applied across various sectors. Specifically, it possesses the ability to utilize two or more …

A review on generative adversarial networks: Algorithms, theory, and applications

J Gui, Z Sun, Y Wen, D Tao, J Ye - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …

Make-a-video: Text-to-video generation without text-video data

U Singer, A Polyak, T Hayes, X Yin, J An… - arxiv preprint arxiv …, 2022 - arxiv.org
We propose Make-A-Video--an approach for directly translating the tremendous recent
progress in Text-to-Image (T2I) generation to Text-to-Video (T2V). Our intuition is simple …

Tifa: Accurate and interpretable text-to-image faithfulness evaluation with question answering

Y Hu, B Liu, J Kasai, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Despite thousands of researchers, engineers, and artists actively working on improving text-
to-image generation models, systems often fail to produce images that accurately align with …

Show-1: Marrying pixel and latent diffusion models for text-to-video generation

DJ Zhang, JZ Wu, JW Liu, R Zhao, L Ran, Y Gu… - International Journal of …, 2024 - Springer
Significant advancements have been achieved in the realm of large-scale pre-trained text-to-
video Diffusion Models (VDMs). However, previous methods either rely solely on pixel …

[PDF][PDF] Scaling autoregressive models for content-rich text-to-image generation

J Yu, Y Xu, JY Koh, T Luong, G Baid, Z Wang… - arxiv preprint arxiv …, 2022 - 3dvar.com
Abstract We present the Pathways [1] Autoregressive Text-to-Image (Parti) model, which
generates high-fidelity photorealistic images and supports content-rich synthesis involving …

Layoutdiffusion: Controllable diffusion model for layout-to-image generation

G Zheng, X Zhou, X Li, Z Qi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, diffusion models have achieved great success in image synthesis. However, when
it comes to the layout-to-image generation where an image often has a complex scene of …

Galip: Generative adversarial clips for text-to-image synthesis

M Tao, BK Bao, H Tang, C Xu - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Synthesizing high-fidelity complex images from text is challenging. Based on large
pretraining, the autoregressive and diffusion models can synthesize photo-realistic images …

Spatext: Spatio-textual representation for controllable image generation

O Avrahami, T Hayes, O Gafni… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-image diffusion models are able to generate convincing results of
unprecedented quality. However, it is nearly impossible to control the shapes of different …

Training-free structured diffusion guidance for compositional text-to-image synthesis

W Feng, X He, TJ Fu, V Jampani, A Akula… - arxiv preprint arxiv …, 2022 - arxiv.org
Large-scale diffusion models have achieved state-of-the-art results on text-to-image
synthesis (T2I) tasks. Despite their ability to generate high-quality yet creative images, we …