A comprehensive survey on applications of transformers for deep learning tasks

S Islam, H Elmekki, A Elsebai, J Bentahar… - Expert Systems with …, 2024 - Elsevier
Transformers are Deep Neural Networks (DNN) that utilize a self-attention
mechanism to capture contextual relationships within sequential data. Unlike traditional …

Deep multimodal data fusion

F Zhao, C Zhang, B Geng - ACM computing surveys, 2024 - dl.acm.org
Multimodal Artificial Intelligence (Multimodal AI), in general, involves various types of data
(e.g., images, texts, or data collected from different sensors), feature engineering (e.g. …

Multiscale vision transformers

H Fan, B Xiong, K Mangalam, Y Li… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …

Conceptual 12M: Pushing web-scale image-text pre-training to recognize long-tail visual concepts

S Changpinyo, P Sharma, N Ding… - Proceedings of the …, 2021 - openaccess.thecvf.com
The availability of large-scale image captioning and visual question answering datasets has
contributed significantly to recent successes in vision-and-language pre-training. However …

Task-adaptive attention for image captioning

C Yan, Y Hao, L Li, J Yin, A Liu, Z Mao… - … on Circuits and …, 2021 - ieeexplore.ieee.org
Attention mechanisms are now widely used in image captioning models. However, most
attention models only focus on visual features. When generating syntax related words, little …

Image encryption algorithm based on a 2D-CLSS hyperchaotic map using simultaneous permutation and diffusion

L Teng, X Wang, Y Xian - Information Sciences, 2022 - Elsevier
A two-dimensional cross-mode hyperchaotic map based on logistic and sine maps (2D-
CLSS) is presented. The hyperchaotic map consists of a logistic map and two sine maps …

Fine-grained visual classification via internal ensemble learning transformer

Q Xu, J Wang, B Jiang, B Luo - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Recently, vision transformers (ViTs) have been investigated in fine-grained visual
recognition (FGVC) and are now considered state of the art. However, most ViT-based works …

An edge traffic flow detection scheme based on deep learning in an intelligent transportation system

C Chen, B Liu, S Wan, P Qiao… - IEEE transactions on …, 2020 - ieeexplore.ieee.org
An intelligent transportation system (ITS) plays an important role in public transport
management, security and other issues. Traffic flow detection is an important part of the ITS …

Deep CNN for brain tumor classification

W Ayadi, W Elhamzi, I Charfi, M Atri - Neural processing letters, 2021 - Springer
Brain tumor represents one of the most fatal cancers around the world. It is a common cancer
in adults and children. It has the lowest survival rate and various types depending on their …

BaGFN: broad attentive graph fusion network for high-order feature interactions

Z Xie, W Zhang, B Sheng, P Li… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Modeling feature interactions is of crucial significance to high-quality feature engineering on
multifield sparse data. At present, a series of state-of-the-art methods extract cross features …