A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT
Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention
from society. As a result, many individuals have become interested in related resources and …
Advances in medical image analysis with vision transformers: a comprehensive review
The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …
Null-text inversion for editing real images using guided diffusion models
Recent large-scale text-guided diffusion models provide powerful image generation
capabilities. Currently, a massive effort is given to enable the modification of these images …
Guiding pretraining in reinforcement learning with large language models
Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped
reward function. Intrinsically motivated exploration methods address this limitation by …
A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …
ClipCap: CLIP prefix for image captioning
Image captioning is a fundamental task in vision-language understanding, where the model
predicts a textual informative caption to a given input image. In this paper, we present a …
4D-fy: Text-to-4D generation using hybrid score distillation sampling
Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and text-to-
video models to generate dynamic 3D scenes. However, current text-to-4D methods face a …
Translation between molecules and natural language
We present MolT5, a self-supervised learning framework for pretraining
models on a vast amount of unlabeled natural language text and molecule strings. …
Quality not quantity: On the interaction between dataset design and robustness of CLIP
Web-crawled datasets have enabled remarkable generalization capabilities in recent image-
text models such as CLIP (Contrastive Language-Image pre-training) or Flamingo, but little …
Deep learning: Systematic review, models, challenges, and research directions
T Talaei Khoei, H Ould Slimane… - Neural Computing and …, 2023 - Springer
The current development in deep learning is witnessing an exponential transition into
automation applications. This automation transition can provide a promising framework for …