Transformers in vision: A survey
Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …
Multimodal image synthesis and editing: A survey and taxonomy
As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …
Styleclip: Text-driven manipulation of stylegan imagery
Inspired by the ability of StyleGAN to generate highly realistic images in a variety of
domains, much recent work has focused on understanding how to use the latent spaces …
Understanding and creating art with AI: Review and outlook
Technologies related to artificial intelligence (AI) have a strong impact on the changes of
research and creative practices in visual arts. The growing number of research initiatives …
Frozen pretrained transformers as universal computation engines
We investigate the capability of a transformer pretrained on natural language to generalize
to other modalities with minimal finetuning--in particular, without finetuning of the self …
Attention, please! A survey of neural attention models in deep learning
In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …
Symbolic music generation with diffusion models
Score-based generative models and diffusion probabilistic models have been successful at
generating high-quality samples in continuous domains such as images and audio …
Generating images with sparse representations
The high dimensionality of images presents architecture and sampling-efficiency challenges
for likelihood-based generative models. Previous approaches such as VQ-VAE use deep …
How to Protect Copyright Data in Optimization of Large Language Models?
The softmax operator is a crucial component of large language models (LLMs), which have
played a transformative role in computer research. Due to the centrality of the softmax …
Attention approximates sparse distributed memory
While Attention has come to be an important mechanism in deep learning, there remains
limited intuition for why it works so well. Here, we show that Transformer Attention can be …