Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …

A review on the attention mechanism of deep learning

Z Niu, G Zhong, H Yu - Neurocomputing, 2021 - Elsevier
Attention has arguably become one of the most important concepts in the deep learning
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …

InternImage: Exploring large-scale vision foundation models with deformable convolutions

W Wang, J Dai, Z Chen, Z Huang, Z Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Compared to the great progress of large-scale vision transformers (ViTs) in recent years,
large-scale models based on convolutional neural networks (CNNs) are still in an early …

Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …

Convolutional neural networks: A survey

M Krichen - Computers, 2023 - mdpi.com
Artificial intelligence (AI) has become a cornerstone of modern technology, revolutionizing
industries from healthcare to finance. Convolutional neural networks (CNNs) are a subset of …

Dynamic neural networks: A survey

Y Han, G Huang, S Song, L Yang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Dynamic neural network is an emerging research topic in deep learning. Compared to static
models which have fixed computational graphs and parameters at the inference stage …

FcaNet: Frequency channel attention networks

Z Qin, P Zhang, F Wu, X Li - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Attention mechanism, especially channel attention, has gained great success in the
computer vision field. Many works focus on how to design efficient channel attention …

Deformable DETR: Deformable transformers for end-to-end object detection

X Zhu, W Su, L Lu, B Li, X Wang, J Dai - arXiv preprint arXiv:2010.04159, 2020 - arxiv.org
DETR has been recently proposed to eliminate the need for many hand-designed
components in object detection while demonstrating good performance. However, it suffers …

MaX-DeepLab: End-to-end panoptic segmentation with mask transformers

H Wang, Y Zhu, H Adam, A Yuille… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present MaX-DeepLab, the first end-to-end model for panoptic segmentation.
Our approach simplifies the current pipeline that depends heavily on surrogate sub-tasks …

Involution: Inverting the inherence of convolution for visual recognition

D Li, J Hu, C Wang, X Li, Q She, L Zhu… - Proceedings of the …, 2021 - openaccess.thecvf.com
Convolution has been the core ingredient of modern neural networks, triggering the surge of
deep learning in vision. In this work, we rethink the inherent principles of standard …