Object detection using deep learning, CNNs and vision transformers: A review
Detecting objects remains one of computer vision and image understanding applications'
most fundamental and challenging aspects. Significant advances in object detection have …
most fundamental and challenging aspects. Significant advances in object detection have …
Automatic analysis of facial actions: A survey
As one of the most comprehensive and objective ways to describe facial expressions, the
Facial Action Coding System (FACS) has recently received significant attention. Over the …
Facial Action Coding System (FACS) has recently received significant attention. Over the …
Context encoding for semantic segmentation
Recent work has made significant progress in improving spatial resolution for pixelwise
labeling with Fully Convolutional Network (FCN) framework by employing Dilated/Atrous …
labeling with Fully Convolutional Network (FCN) framework by employing Dilated/Atrous …
Unsupervised visual representation learning by context prediction
This work explores the use of spatial context as a source of free and plentiful supervisory
signal for training a rich visual representation. Given only a large, unlabeled image …
signal for training a rich visual representation. Given only a large, unlabeled image …
Vision transformers for remote sensing image classification
In this paper, we propose a remote-sensing scene-classification method based on vision
transformers. These types of networks, which are now recognized as state-of-the-art models …
transformers. These types of networks, which are now recognized as state-of-the-art models …
Visual relationship detection with language priors
Visual relationships capture a wide variety of interactions between pairs of objects in images
(eg “man riding bicycle” and “man pushing bicycle”). Consequently, the set of possible …
(eg “man riding bicycle” and “man pushing bicycle”). Consequently, the set of possible …
[BOOK][B] Computer vision: algorithms and applications
R Szeliski - 2022 - books.google.com
Humans perceive the three-dimensional structure of the world with apparent ease. However,
despite all of the recent advances in computer vision research, the dream of having a …
despite all of the recent advances in computer vision research, the dream of having a …
Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories
This paper presents a method for recognizing scene categories based on approximate
global geometric correspondence. This technique works by partitioning the image into …
global geometric correspondence. This technique works by partitioning the image into …
Unsupervised learning of visual representations using videos
Is strong supervision necessary for learning a good visual representation? Do we really
need millions of semantically-labeled images to train a Convolutional Neural Network …
need millions of semantically-labeled images to train a Convolutional Neural Network …
LabelMe: a database and web-based tool for image annotation
We seek to build a large collection of images with ground truth labels to be used for object
detection and recognition research. Such data is useful for supervised learning and …
detection and recognition research. Such data is useful for supervised learning and …