- Academic Search

Pmr: Prototypical modal rebalance for multimodal learning

Y Fan, W Xu, H Wang, J Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Multimodal learning (MML) aims to jointly exploit the common priors of different modalities to
compensate for their inherent limitations. However, existing MML methods often optimize a …

Cited by 69 · All 11 versions

[PDF] arxiv.org

Modeling spatial-temporal clues in a hybrid deep learning framework for video classification

Z Wu, X Wang, YG Jiang, H Ye, X Xue - Proceedings of the 23rd ACM …, 2015 - dl.acm.org

Classifying videos according to content semantics is an important problem with a wide range
of applications. In this paper, we propose a hybrid deep learning framework for video …

Cited by 602 · All 6 versions

[PDF] arxiv.org

Exploiting feature and class relationships in video categorization with regularized deep neural networks

YG Jiang, Z Wu, J Wang, X Xue… - IEEE transactions on …, 2017 - ieeexplore.ieee.org

In this paper, we study the challenging problem of categorizing videos according to high-
level semantics such as the existence of a particular human action or a complex event …

Cited by 456 · All 12 versions

[PDF] arxiv.org

Moddrop: adaptive multi-modal gesture recognition

N Neverova, C Wolf, G Taylor… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org

We present a method for gesture detection and localisation based on multi-scale and multi-
modal deep learning. Each visual modality captures spatial information at a particular spatial …

Cited by 452 · All 14 versions

[PDF] nih.gov

Semantic pooling for complex event analysis in untrimmed videos

X Chang, YL Yu, Y Yang… - IEEE transactions on …, 2016 - ieeexplore.ieee.org

Pooling plays an important role in generating a discriminative video representation. In this
paper, we propose a new semantic pooling approach for challenging event analysis tasks …

Cited by 353 · All 13 versions

[PDF] azurewebsites.net

Multi-stream multi-class fusion of deep networks for video classification

Z Wu, YG Jiang, X Wang, H Ye, X Xue - Proceedings of the 24th ACM …, 2016 - dl.acm.org

This paper studies deep network architectures to address the problem of video classification.
A multi-stream framework is proposed to fully utilize the rich multimodal information in …

Cited by 237 · All 3 versions

[PDF] arxiv.org

Learning to score figure skating sport videos

C Xu, Y Fu, B Zhang, Z Chen… - IEEE transactions on …, 2019 - ieeexplore.ieee.org

This paper aims at learning to score the figure skating sports videos. To address this task,
we propose a deep architecture that includes two complementary components, i.e., Self …

Cited by 147 · All 6 versions

[PDF] unitn.it

Two-stream deep architecture for hyperspectral image classification

S Hao, W Wang, Y Ye, T Nie… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org

Most traditional approaches classify hyperspectral image (HSI) pixels relying only on the
spectral values of the input channels. However, the spatial context around a pixel is also …

Cited by 142 · All 5 versions

[PDF] arxiv.org

Modeling multimodal clues in a hybrid deep learning framework for video classification

YG Jiang, Z Wu, J Tang, Z Li, X Xue… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org

Videos are inherently multimodal. This paper studies the problem of exploiting the abundant
multimodal clues for improved video classification performance. We introduce a novel hybrid …

Cited by 126 · All 5 versions

[PDF] fudan.edu.cn

Exploring inter-feature and inter-class relationships with deep neural networks for video classification

Z Wu, YG Jiang, J Wang, J Pu, X Xue - Proceedings of the 22nd ACM …, 2014 - dl.acm.org

Videos contain very rich semantics and are intrinsically multimodal. In this paper, we study
the challenging task of classifying videos according to their high-level semantics such as …

Cited by 175 · All 4 versions

Sample-specific late fusion for visual category recognition
