Multimodal image synthesis and editing: A survey and taxonomy
As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …
among multimodal information plays a key role for the creation and perception of multimodal …
Deep learning for anomaly detection: A review
Anomaly detection, aka outlier detection or novelty detection, has been a lasting yet active
research area in various research communities for several decades. There are still some …
research area in various research communities for several decades. There are still some …
Autoregressive image generation without vector quantization
Conventional wisdom holds that autoregressive models for image generation are typically
accompanied by vector-quantized tokens. We observe that while a discrete-valued space …
accompanied by vector-quantized tokens. We observe that while a discrete-valued space …
Regularized vector quantization for tokenized image synthesis
Quantizing images into discrete representations has been a fundamental problem in unified
generative modeling. Predominant approaches learn the discrete representation either in a …
generative modeling. Predominant approaches learn the discrete representation either in a …
[CITAS][C] An introduction to variational autoencoders
An Introduction to Variational Autoencoders Page 1 An Introduction to Variational Autoencoders
Page 2 Other titles in Foundations and Trends R in Machine Learning Computational Optimal …
Page 2 Other titles in Foundations and Trends R in Machine Learning Computational Optimal …
Deep learning in multi-object detection and tracking: state of the art
Object detection and tracking is one of the most important and challenging branches in
computer vision, and have been widely applied in various fields, such as health-care …
computer vision, and have been widely applied in various fields, such as health-care …
Attention, please! A survey of neural attention models in deep learning
In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …
limited ability to process competing sources, attention mechanisms select, modulate, and …
A comprehensive review on deep learning-based methods for video anomaly detection
Video surveillance systems are popular and used in public places such as market places,
shop** malls, hospitals, banks, streets, education institutions, city administrative offices …
shop** malls, hospitals, banks, streets, education institutions, city administrative offices …
Neural discrete representation learning
Learning useful representations without supervision remains a key challenge in machine
learning. In this paper, we propose a simple yet powerful generative model that learns such …
learning. In this paper, we propose a simple yet powerful generative model that learns such …
Categorical reparameterization with gumbel-softmax
Categorical variables are a natural choice for representing discrete structure in the world.
However, stochastic neural networks rarely use categorical latent variables due to the …
However, stochastic neural networks rarely use categorical latent variables due to the …