Convolutional neural networks or vision transformers: Who will win the race for action recognitions in visual data?
Understanding actions in videos remains a significant challenge in computer vision, which
has been the subject of several pieces of research in the last decades. Convolutional neural …
Deep learning in food authenticity: Recent advances and future trends
Background: The development of fast, efficient, accurate, and reliable techniques and
methods for food authenticity identification is crucial for food quality assurance. Traditional …
Scaling up your kernels to 31x31: Revisiting large kernel design in cnns
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …
Inception transformer
Recent studies show that transformer has strong capability of building long-range
dependencies, yet is incompetent in capturing high frequencies that predominantly convey …
Metaformer baselines for vision
MetaFormer, the abstracted architecture of Transformer, has been found to play a significant
role in achieving competitive performance. In this paper, we further explore the capacity of …
M3T: three-dimensional Medical image classifier using Multi-plane and Multi-slice Transformer
In this study, we propose a three-dimensional Medical image classifier using Multi-plane
and Multi-slice Transformer (M3T) network to classify Alzheimer's disease (AD) in 3D MRI …
Bridging the gap between vision transformers and convolutional neural networks on small datasets
There still remains an extreme performance gap between Vision Transformers (ViTs) and
Convolutional Neural Networks (CNNs) when training from scratch on small datasets, which …
A tactile oral pad based on carbon nanotubes for multimodal haptic interaction
Wearable systems that incorporate soft tactile sensors that transmit spatio-temporal touch
patterns may be useful in the development of biomedical robotics. Such systems have been …
Peripheral vision transformer
Human vision possesses a special type of visual processing systems called peripheral
vision. Partitioning the entire visual field into multiple contour regions based on the distance …
ATFE-Net: Axial Transformer and Feature Enhancement-based CNN for ultrasound breast mass segmentation
Breast mass is one of the main clinical symptoms of breast cancer. Recently, many CNN-
based methods for breast mass segmentation have been proposed. However, these …