Deep multimodal learning: A survey on recent advances and trends

D Ramachandram, GW Taylor - IEEE signal processing …, 2017 - ieeexplore.ieee.org
The success of deep learning has been a catalyst to solving increasingly complex
machine-learning problems, which often involve multiple data modalities. We review recent advances …

Towards robust pattern recognition: A review

XY Zhang, CL Liu, CY Suen - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
The accuracies for many pattern recognition tasks have increased rapidly year by year,
achieving or even outperforming human performance. From the perspective of accuracy …

Visual place recognition: A survey from deep learning perspective

X Zhang, L Wang, Y Su - Pattern Recognition, 2021 - Elsevier
Visual place recognition has attracted widespread research interest in multiple fields such
as computer vision and robotics. Recently, researchers have employed advanced deep …

Jointly learning heterogeneous features for RGB-D activity recognition

JF Hu, WS Zheng, J Lai… - Proceedings of the IEEE …, 2015 - openaccess.thecvf.com
In this paper, we focus on heterogeneous feature learning for RGB-D activity recognition.
Considering that features from different channels could share some similar hidden …

Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection

H Chen, Y Li, D Su - Pattern Recognition, 2019 - Elsevier
Paired RGB and depth images are becoming popular multi-modal data adopted in computer
vision tasks. Traditional methods based on Convolutional Neural Networks (CNNs) typically …

Three-stream attention-aware network for RGB-D salient object detection

H Chen, Y Li - IEEE Transactions on Image Processing, 2019 - ieeexplore.ieee.org
Previous RGB-D fusion systems based on convolutional neural networks typically employ a
two-stream architecture, in which RGB and depth inputs are learned independently. The …

Progressively complementarity-aware fusion network for RGB-D salient object detection

H Chen, Y Li - Proceedings of the IEEE conference on …, 2018 - openaccess.thecvf.com
How to incorporate cross-modal complementarity sufficiently is the cornerstone question for
RGB-D salient object detection. Previous works mainly address this issue by simply …

Multi-view classification with convolutional neural networks

M Seeland, P Mäder - PLOS ONE, 2021 - journals.plos.org
Humans' decision making process often relies on utilizing visual information from different
views or perspectives. However, in machine-learning-based image classification we …

Learning common and feature-specific patterns: a novel multiple-sparse-representation-based tracker

X Lan, S Zhang, PC Yuen… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
The use of multiple features has been shown to be an effective strategy for visual tracking
because of their complementary contributions to appearance modeling. The key problem is …

Learning common and specific features for RGB-D semantic segmentation with deconvolutional networks

J Wang, Z Wang, D Tao, S See, G Wang - Computer Vision–ECCV 2016 …, 2016 - Springer
In this paper, we tackle the problem of RGB-D semantic segmentation of indoor images. We
take advantage of deconvolutional networks which can predict pixel-wise class labels, and …