Convolutional neural networks or vision transformers: Who will win the race for action recognitions in visual data?

O Moutik, H Sekkat, S Tigani, A Chehri, R Saadane… - Sensors, 2023 - mdpi.com
Understanding actions in videos remains a significant challenge in computer vision, which
has been the subject of several pieces of research in the last decades. Convolutional neural …

Deep learning in food authenticity: Recent advances and future trends

Z Deng, T Wang, Y Zheng, W Zhang, YH Yun - Trends in Food Science & …, 2024 - Elsevier
Background: The development of fast, efficient, accurate, and reliable techniques and
methods for food authenticity identification is crucial for food quality assurance. Traditional …

Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …

Inception transformer

C Si, W Yu, P Zhou, Y Zhou… - Advances in Neural …, 2022 - proceedings.neurips.cc
Recent studies show that transformer has strong capability of building long-range
dependencies, yet is incompetent in capturing high frequencies that predominantly convey …

MetaFormer baselines for vision

W Yu, C Si, P Zhou, M Luo, Y Zhou… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
MetaFormer, the abstracted architecture of Transformer, has been found to play a significant
role in achieving competitive performance. In this paper, we further explore the capacity of …

M3T: three-dimensional Medical image classifier using Multi-plane and Multi-slice Transformer

J Jang, D Hwang - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
In this study, we propose a three-dimensional Medical image classifier using Multi-plane
and Multi-slice Transformer (M3T) network to classify Alzheimer's disease (AD) in 3D MRI …

Bridging the gap between vision transformers and convolutional neural networks on small datasets

Z Lu, H Xie, C Liu, Y Zhang - Advances in Neural …, 2022 - proceedings.neurips.cc
There still remains an extreme performance gap between Vision Transformers (ViTs) and
Convolutional Neural Networks (CNNs) when training from scratch on small datasets, which …

A tactile oral pad based on carbon nanotubes for multimodal haptic interaction

B Hou, D Yang, X Ren, L Yi, X Liu - Nature Electronics, 2024 - nature.com
Wearable systems that incorporate soft tactile sensors that transmit spatio-temporal touch
patterns may be useful in the development of biomedical robotics. Such systems have been …

Peripheral vision transformer

J Min, Y Zhao, C Luo, M Cho - Advances in Neural …, 2022 - proceedings.neurips.cc
Human vision possesses a special type of visual processing systems called peripheral
vision. Partitioning the entire visual field into multiple contour regions based on the distance …

ATFE-Net: Axial Transformer and Feature Enhancement-based CNN for ultrasound breast mass segmentation

Z Ma, Y Qi, C Xu, W Zhao, M Lou, Y Wang… - Computers in Biology and …, 2023 - Elsevier
Breast mass is one of the main clinical symptoms of breast cancer. Recently, many CNN-
based methods for breast mass segmentation have been proposed. However, these …