Transformer for object detection: Review and benchmark

Y Li, N Miao, L Ma, F Shuang, X Huang - Engineering Applications of …, 2023 - Elsevier
Object detection is a crucial task in computer vision (CV). With the rapid advancement of
Transformer-based models in natural language processing (NLP) and various visual tasks …

Vision transformers for dense prediction: A survey

S Zuo, Y Xiao, X Chang, X Wang - Knowledge-based systems, 2022 - Elsevier
Transformers have demonstrated impressive expressiveness and transfer capability in
computer vision fields. Dense prediction is a fundamental problem in computer vision that is …

AFPN: Asymptotic feature pyramid network for object detection

G Yang, J Lei, Z Zhu, S Cheng, Z Feng… - … on Systems, Man, and …, 2023 - ieeexplore.ieee.org
Multi-scale features are of great importance in encoding objects with scale variance in object
detection tasks. A common strategy for multi-scale feature extraction is adopting the classic …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - IEEE transactions on …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Centralized feature pyramid for object detection

Y Quan, D Zhang, L Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The visual feature pyramid has shown its superiority in both effectiveness and efficiency in a
variety of applications. However, current methods overly focus on inter-layer feature …

SwinSUNet: Pure transformer network for remote sensing image change detection

C Zhang, L Wang, S Cheng, Y Li - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Convolutional neural network (CNN) can extract effective semantic features, so it was widely
used for remote sensing image change detection (CD) in the latest years. CNN has acquired …

On the integration of self-attention and convolution

X Pan, C Ge, R Lu, S Song, G Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Convolution and self-attention are two powerful techniques for representation learning, and
they are usually considered as two peer approaches that are distinct from each other. In this …

A survey of visual transformers

Y Liu, Y Zhang, Y Wang, F Hou, J Yuan… - … on Neural Networks …, 2023 - ieeexplore.ieee.org
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …

MOTR: End-to-end multiple-object tracking with transformer

F Zeng, B Dong, Y Zhang, T Wang, X Zhang… - European conference on …, 2022 - Springer
Temporal modeling of objects is a key challenge in multiple-object tracking (MOT). Existing
methods track by associating detections through motion-based and appearance-based …

Remote sensing image change detection with transformers

H Chen, Z Qi, Z Shi - IEEE Transactions on Geoscience and …, 2021 - ieeexplore.ieee.org
Modern change detection (CD) has achieved remarkable success by the powerful
discriminative ability of deep convolutions. However, high-resolution remote sensing CD …