STAT: Spatial-temporal attention mechanism for video captioning
Video captioning refers to automatic generate natural language sentences, which
summarize the video contents. Inspired by the visual attention mechanism of human beings …
summarize the video contents. Inspired by the visual attention mechanism of human beings …
Unsupervised person re-identification: Clustering and fine-tuning
The superiority of deeply learned pedestrian representations has been reported in very
recent literature of person re-identification (re-ID). In this article, we consider the more …
recent literature of person re-identification (re-ID). In this article, we consider the more …
Deep hyperspectral image sharpening
Hyperspectral image (HSI) sharpening, which aims at fusing an observable low spatial
resolution (LR) HSI (LR-HSI) with a high spatial resolution (HR) multispectral image (HR …
resolution (LR) HSI (LR-HSI) with a high spatial resolution (HR) multispectral image (HR …
Spatial and semantic convolutional features for robust visual object tracking
J Zhang, X **, J Sun, J Wang, AK Sangaiah - Multimedia Tools and …, 2020 - Springer
Robust and accurate visual tracking is a challenging problem in computer vision. In this
paper, we exploit spatial and semantic convolutional features extracted from convolutional …
paper, we exploit spatial and semantic convolutional features extracted from convolutional …
Supervised hash coding with deep neural network for environment perception of intelligent vehicles
Image content analysis is an important surround perception modality of intelligent vehicles.
In order to efficiently recognize the on-road environment based on image content analysis …
In order to efficiently recognize the on-road environment based on image content analysis …
Unsupervised deep video hashing via balanced code for large-scale video retrieval
This paper proposes a deep hashing framework, namely, unsupervised deep video hashing
(UDVH), for large-scale video similarity search with the aim to learn compact yet effective …
(UDVH), for large-scale video similarity search with the aim to learn compact yet effective …
A scalable region-based level set method using adaptive bilateral filter for noisy image segmentation
Image segmentation plays an important role in the computer vision. However, it is extremely
challenging due to low resolution, high noise and blurry boundaries. Recently, region-based …
challenging due to low resolution, high noise and blurry boundaries. Recently, region-based …
Joint transmission map estimation and dehazing using deep networks
Single image haze removal is an extremely challenging problem due to its inherent ill-posed
nature. Several prior-based and learning-based methods have been proposed in the …
nature. Several prior-based and learning-based methods have been proposed in the …
Deep cascade learning
In this paper, we propose a novel approach for efficient training of deep neural networks in a
bottom-up fashion using a layered structure. Our algorithm, which we refer to as deep …
bottom-up fashion using a layered structure. Our algorithm, which we refer to as deep …
Optimization of deep convolutional neural network for large scale image retrieval
Feature extraction and similarity measurement are two key steps in image retrieval. AlexNet
is a classical deep convolutional neural network for image classification, but using it directly …
is a classical deep convolutional neural network for image classification, but using it directly …