New ideas and trends in deep multimodal content understanding: A review

W Chen, W Wang, L Liu, MS Lew - Neurocomputing, 2021 - Elsevier
The focus of this survey is on the analysis of two modalities of multimodal deep learning:
image and text. Unlike classic reviews of deep learning where monomodal image classifiers …

Modeling multi-label action dependencies for temporal action localization

P Tirupattur, K Duarte, YS Rawat… - Proceedings of the …, 2021 - openaccess.thecvf.com
Real world videos contain many complex actions with inherent relationships between action
classes. In this work, we propose an attention-based architecture that models these action …

Order-free RNN with visual attention for multi-label classification

SF Chen, YC Chen, CK Yeh, YC Wang - Proceedings of the AAAI …, 2018 - ojs.aaai.org
We propose a recurrent neural network (RNN) based model for image multi-label
classification. Our model uniquely integrates the learning of visual attention and Long Short …

Wavelet convolutional neural networks

S Fujieda, K Takayama, T Hachisuka - ar…

A survey and analysis on automatic image annotation

Q Cheng, Q Zhang, P Fu, C Tu, S Li - Pattern Recognition, 2018 - Elsevier
In recent years, image annotation has attracted extensive attention due to the explosive
growth of image data. With the capability of describing images at the semantic level, image …

Semi-Supervised Robust Deep Neural Networks for Multi-Label Classification.

H Cevikalp, B Benligiray, ÖN Gerek… - CVPR …, 2019 - openaccess.thecvf.com
In this paper, we propose a robust method for semisupervised training of deep neural
networks for multi-label image classification. To this end, we use ramp loss, which is more …

Semantic regularisation for recurrent image annotation

F Liu, T Xiang, TM Hospedales… - Proceedings of the …, 2017 - openaccess.thecvf.com
The "CNN-RNN" design pattern is increasingly widely applied in a variety of image
annotation tasks including multi-label classification and captioning. Existing models use the …

Tensor normalization and full distribution training

W Fuhl - arXiv preprint arXiv:2109.02345, 2021 - arxiv.org
In this work, we introduce pixel-wise tensor normalization, which is inserted after rectified
linear units and, together with batch normalization, provides a significant improvement in the …

Double attention based on graph attention network for image multi-label classification

W Zhou, Z Xia, P Dou, T Su, H Hu - ACM Transactions on Multimedia …, 2023 - dl.acm.org
The task of image multi-label classification is to accurately recognize multiple objects in an
input image. Most of the recent works need to leverage the label co-occurrence matrix …