Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2024 - Elsevier
The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

Artificial intelligence for remote sensing data analysis: A review of challenges and opportunities

L Zhang, L Zhang - IEEE Geoscience and Remote Sensing …, 2022 - ieeexplore.ieee.org
Artificial intelligence (AI) plays a growing role in remote sensing (RS). Applications of AI,
particularly machine learning algorithms, range from initial image processing to high-level …

ZegCLIP: Towards adapting CLIP for zero-shot semantic segmentation

Z Zhou, Y Lei, B Zhang, L Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recently, CLIP has been applied to pixel-level zero-shot learning tasks via a two-stage
scheme. The general idea is to first generate class-agnostic region proposals and then feed …

Rethinking semantic segmentation: A prototype view

T Zhou, W Wang, E Konukoglu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Prevalent semantic segmentation solutions, despite their different network designs (FCN
based or attention based) and mask decoding strategies (parametric softmax based or pixel …

A review of convolutional neural network architectures and their optimizations

S Cong, Y Zhou - Artificial Intelligence Review, 2023 - Springer
The research advances concerning the typical architectures of convolutional neural
networks (CNNs) as well as their optimizations are analyzed and elaborated in detail in this …

SimAM: A simple, parameter-free attention module for convolutional neural networks

L Yang, RY Zhang, L Li, X Xie - International conference on …, 2021 - proceedings.mlr.press
In this paper, we propose a conceptually simple but very effective attention module for
Convolutional Neural Networks (ConvNets). In contrast to existing channel-wise and spatial …

SegFormer: Simple and efficient design for semantic segmentation with transformers

E Xie, W Wang, Z Yu, A Anandkumar… - Advances in neural …, 2021 - proceedings.neurips.cc
We present SegFormer, a simple, efficient yet powerful semantic segmentation framework
which unifies Transformers with lightweight multilayer perceptron (MLP) decoders …

OPV2V: An open benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication

R Xu, H Xiang, X Xia, X Han, J Li… - … Conference on Robotics …, 2022 - ieeexplore.ieee.org
Employing Vehicle-to-Vehicle communication to enhance perception performance in self-
driving technology has attracted considerable attention recently; however, the absence of a …

EfficientNetV2: Smaller models and faster training

M Tan, Q Le - International conference on machine learning, 2021 - proceedings.mlr.press
This paper introduces EfficientNetV2, a new family of convolutional networks that have faster
training speed and better parameter efficiency than previous models. To develop these …

Pyramid vision transformer: A versatile backbone for dense prediction without convolutions

W Wang, E Xie, X Li, DP Fan, K Song… - Proceedings of the …, 2021 - openaccess.thecvf.com
Although convolutional neural networks (CNNs) have achieved great success in computer
vision, this work investigates a simpler, convolution-free backbone network useful for many …