A survey of the vision transformers and their CNN-transformer based variants

A Khan, Z Rauf, A Sohail, AR Khan, H Asif… - Artificial Intelligence …, 2023 - Springer
Vision transformers have become popular as a possible substitute to convolutional neural
networks (CNNs) for a variety of computer vision applications. These transformers, with their …

P2T: Pyramid pooling transformer for scene understanding

YH Wu, Y Liu, X Zhan… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Recently, the vision transformer has achieved great success by pushing the state-of-the-art
of various vision tasks. One of the most challenging problems in the vision transformer is that …

CasCast: Skillful high-resolution precipitation nowcasting via cascaded modelling

J Gong, L Bai, P Ye, W Xu, N Liu, J Dai, X Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Precipitation nowcasting based on radar data plays a crucial role in extreme weather
prediction and has broad implications for disaster management. Despite progresses have …

6D-ViT: Category-level 6D object pose estimation via transformer-based instance representation learning

L Zou, Z Huang, N Gu, G Wang - IEEE Transactions on Image …, 2022 - ieeexplore.ieee.org
This paper presents 6D vision transformer (6D-ViT), a transformer-based instance
representation learning network suitable for highly accurate category-level object pose …

Transformer-based multi-attention hybrid networks for skin lesion segmentation

Z Dong, J Li, Z Hua - Expert Systems with Applications, 2024 - Elsevier
High-precision segmentation of skin lesions is essential for early diagnosis of skin cancer
and improved patient survival. However, this task becomes challenging due to the …

RingMo-lite: A remote sensing lightweight network with CNN-transformer hybrid framework

Y Wang, T Zhang, L Zhao, L Hu, Z Wang… - … on Geoscience and …, 2024 - ieeexplore.ieee.org
In recent years, remote sensing (RS) vision foundation models, such as RingMo, have
emerged and achieved excellent performance in various downstream tasks. However, the …

STB-VMM: Swin transformer based video motion magnification

R Lado-Roigé, MA Pérez - Knowledge-Based Systems, 2023 - Elsevier
The goal of video motion magnification techniques is to magnify small motions in a video to
reveal previously invisible or unseen movement. Its uses extend from bio-medical …

Utilizing adaptive deformable convolution and position embedding for colon polyp segmentation with a visual transformer

MY Sikkandar, SG Sundaram, A Alassaf… - Scientific Reports, 2024 - nature.com
Polyp detection is a challenging task in the diagnosis of Colorectal Cancer (CRC), and it
demands clinical expertise due to the diverse nature of polyps. The recent years have …