A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications

L Alzubaidi, J Bai, A Al-Sabaawi, J Santamaría… - Journal of Big Data, 2023 - Springer
Data scarcity is a major challenge when training deep learning (DL) models. DL demands a
large amount of data to achieve exceptional performance. Unfortunately, many applications …

Understanding GANs: Fundamentals, variants, training challenges, applications, and open problems

Z Ahmad, ZA Jaffri, M Chen, S Bao - Multimedia Tools and Applications, 2024 - Springer
Generative adversarial networks (GANs), a novel framework for training generative models
in an adversarial setup, have attracted significant attention in recent years. The two …

Cross-view panorama image synthesis with progressive attention GANs

S Wu, H Tang, XY Jing, J Qian, N Sebe, Y Yan… - Pattern Recognition, 2022 - Elsevier
Despite the significant progress of conditional image generation, it remains difficult to
synthesize a ground-view panorama image from a top-view aerial image. Among the core …

Vision-language matching for text-to-image synthesis via generative adversarial networks

Q Cheng, K Wen, X Gu - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org
Text-to-image synthesis is an attractive but challenging task that aims to generate a
photo-realistic and semantic consistent image from a specific text description. The images …

Controllable image synthesis methods, applications and challenges: a comprehensive survey

S Huang, Q Li, J Liao, S Wang, L Liu, L Li - Artificial Intelligence Review, 2024 - Springer
Controllable Image Synthesis (CIS) is a methodology that allows users to generate
desired images or manipulate specific attributes of images by providing precise input …

Vision+ language applications: A survey

Y Zhou, N Shimada - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Text-to-image generation has attracted significant interest from researchers and practitioners
in recent years due to its widespread and diverse applications across various industries …

Text-based person search without parallel image-text data

Y Bai, J Wang, M Cao, C Chen, Z Cao, L Nie… - Proceedings of the 31st …, 2023 - dl.acm.org
Text-based person search (TBPS) aims to retrieve the images of the target person from a
large image gallery based on a given natural language description. Existing methods are …

MISL: Multi-grained image-text semantic learning for text-guided image inpainting

X Wu, K Zhao, Q Huang, Q Wang, Z Yang, G Hao - Pattern Recognition, 2024 - Elsevier
Text-guided image inpainting aims to generate corrupted image patches and obtain a
plausible image based on textual descriptions, considering the relationship between textual …

Where you edit is what you get: Text-guided image editing with region-based attention

C Xiao, Q Yang, X Xu, J Zhang, F Zhou, C Zhang - Pattern Recognition, 2023 - Elsevier
Leveraging the abundant knowledge learned from pre-trained multi-modal models like CLIP
has recently proved to be effective for text-guided image editing. Though convincing results …

Semantic Similarity Distance: Towards better text-image consistency metric in text-to-image generation

Z Tan, X Yang, Z Ye, Q Wang, Y Yan, A Nguyen… - Pattern Recognition, 2023 - Elsevier
Generating high-quality images from text remains a challenge in visual-language
understanding, with text-image consistency being a major concern. Particularly, the most …