PMR: Prototypical modal rebalance for multimodal learning

Y Fan, W Xu, H Wang, J Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Multimodal learning (MML) aims to jointly exploit the common priors of different modalities to
compensate for their inherent limitations. However, existing MML methods often optimize a …

Modeling spatial-temporal clues in a hybrid deep learning framework for video classification

Z Wu, X Wang, YG Jiang, H Ye, X Xue - Proceedings of the 23rd ACM …, 2015 - dl.acm.org
Classifying videos according to content semantics is an important problem with a wide range
of applications. In this paper, we propose a hybrid deep learning framework for video …

Exploiting feature and class relationships in video categorization with regularized deep neural networks

YG Jiang, Z Wu, J Wang, X Xue… - IEEE transactions on …, 2017 - ieeexplore.ieee.org
In this paper, we study the challenging problem of categorizing videos according to high-
level semantics such as the existence of a particular human action or a complex event …

ModDrop: adaptive multi-modal gesture recognition

N Neverova, C Wolf, G Taylor… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
We present a method for gesture detection and localisation based on multi-scale and multi-
modal deep learning. Each visual modality captures spatial information at a particular spatial …

Semantic pooling for complex event analysis in untrimmed videos

X Chang, YL Yu, Y Yang… - IEEE transactions on …, 2016 - ieeexplore.ieee.org
Pooling plays an important role in generating a discriminative video representation. In this
paper, we propose a new semantic pooling approach for challenging event analysis tasks …

Multi-stream multi-class fusion of deep networks for video classification

Z Wu, YG Jiang, X Wang, H Ye, X Xue - Proceedings of the 24th ACM …, 2016 - dl.acm.org
This paper studies deep network architectures to address the problem of video classification.
A multi-stream framework is proposed to fully utilize the rich multimodal information in …

Learning to score figure skating sport videos

C Xu, Y Fu, B Zhang, Z Chen… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
This paper aims at learning to score the figure skating sports videos. To address this task,
we propose a deep architecture that includes two complementary components, i.e., Self …

Two-stream deep architecture for hyperspectral image classification

S Hao, W Wang, Y Ye, T Nie… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Most traditional approaches classify hyperspectral image (HSI) pixels relying only on the
spectral values of the input channels. However, the spatial context around a pixel is also …

Modeling multimodal clues in a hybrid deep learning framework for video classification

YG Jiang, Z Wu, J Tang, Z Li, X Xue… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Videos are inherently multimodal. This paper studies the problem of exploiting the abundant
multimodal clues for improved video classification performance. We introduce a novel hybrid …

Exploring inter-feature and inter-class relationships with deep neural networks for video classification

Z Wu, YG Jiang, J Wang, J Pu, X Xue - Proceedings of the 22nd ACM …, 2014 - dl.acm.org
Videos contain very rich semantics and are intrinsically multimodal. In this paper, we study
the challenging task of classifying videos according to their high-level semantics such as …