A comprehensive survey on source-free domain adaptation

J Li, Z Yu, Z Du, L Zhu, HT Shen - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Over the past decade, domain adaptation has become a widely studied branch of transfer
learning, which aims to improve performance on target domains by leveraging knowledge …

Sparsity in transformers: A systematic literature review

M Farina, U Ahmad, A Taha, H Younes, Y Mesbah… - Neurocomputing, 2024 - Elsevier
Transformers have become the state-of-the-art architectures for various tasks in Natural
Language Processing (NLP) and Computer Vision (CV); however, their space and …

VMamba: Visual state space model

Y Liu, Y Tian, Y Zhao, H Yu, L Xie… - Advances in neural …, 2025 - proceedings.neurips.cc
Designing computationally efficient network architectures remains an ongoing necessity in
computer vision. In this paper, we adapt Mamba, a state-space language model, into …

FLatten Transformer: Vision transformer using focused linear attention

D Han, X Pan, Y Han, S Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
The quadratic computation complexity of self-attention has been a persistent challenge
when applying Transformer models to vision tasks. Linear attention, on the other hand, offers …

OneFormer: One transformer to rule universal image segmentation

J Jain, J Li, MT Chiu, A Hassani… - Proceedings of the …, 2023 - openaccess.thecvf.com
Universal Image Segmentation is not a new concept. Past attempts to unify image
segmentation include scene parsing, panoptic segmentation, and, more recently, new …

Agent attention: On the integration of softmax and linear attention

D Han, T Ye, Y Han, Z Xia, S Pan, P Wan… - … on Computer Vision, 2024 - Springer
The attention module is the key component in Transformers. While the global attention
mechanism offers high expressiveness, its excessive computational cost restricts its …

ViT-CoMer: Vision transformer with convolutional multi-scale feature interaction for dense predictions

C Xia, X Wang, F Lv, X Hao… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Although Vision Transformer (ViT) has achieved significant success in computer
vision, it does not perform well in dense prediction tasks due to the lack of inner-patch …

MetaFormer baselines for vision

W Yu, C Si, P Zhou, M Luo, Y Zhou… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
MetaFormer, the abstracted architecture of Transformer, has been found to play a significant
role in achieving competitive performance. In this paper, we further explore the capacity of …

DilateFormer: Multi-scale dilated transformer for visual recognition

J Jiao, YM Tang, KY Lin, Y Gao, AJ Ma… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
As a de facto solution, the vanilla Vision Transformers (ViTs) are encouraged to model long-
range dependencies between arbitrary image patches, while the globally attended receptive …

RMT: Retentive networks meet vision transformers

Q Fan, H Huang, M Chen, H Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Vision Transformer (ViT) has gained increasing attention in the computer vision
community in recent years. However, the core component of ViT, Self-Attention, lacks explicit …