Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2024 - Elsevier

The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

Cited by 149

Wave-ViT: Unifying wavelet and transformers for visual representation learning

T Yao, Y Pan, Y Li, CW Ngo, T Mei - European Conference on Computer …, 2022 - Springer

Multi-scale Vision Transformer (ViT) has emerged as a powerful backbone for
computer vision tasks, while the self-attention computation in Transformer scales …

Cited by 161

MetaFormer baselines for vision

W Yu, C Si, P Zhou, M Luo, Y Zhou… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

MetaFormer, the abstracted architecture of Transformer, has been found to play a significant
role in achieving competitive performance. In this paper, we further explore the capacity of …

Cited by 169

RMT: Retentive networks meet vision transformers

Q Fan, H Huang, M Chen, H Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Vision Transformer (ViT) has gained increasing attention in the computer vision
community in recent years. However, the core component of ViT, Self-Attention, lacks explicit …

Cited by 68

A survey of the vision transformers and their CNN-transformer based variants

A Khan, Z Rauf, A Sohail, AR Khan, H Asif… - Artificial Intelligence …, 2023 - Springer

Vision transformers have become popular as a possible substitute to convolutional neural
networks (CNNs) for a variety of computer vision applications. These transformers, with their …

Cited by 107

CRFormer: cross-resolution transformer for segmentation of grape leaf diseases with context mining

X Zhang, C Cen, F Li, M Liu, W Mu - Expert Systems with Applications, 2023 - Elsevier

In the smart agriculture community, automatic segmentation is an important basis for plant
disease detection and identification. However, the complex background and texturally rich …

Cited by 17

Learning orthogonal prototypes for generalized few-shot semantic segmentation

SA Liu, Y Zhang, Z Qiu, H Xie… - Proceedings of the …, 2023 - openaccess.thecvf.com

Generalized few-shot semantic segmentation (GFSS) distinguishes pixels of base and novel
classes from the background simultaneously, conditioning on sufficient data of base classes …

Cited by 41

ObjectFusion: Multi-modal 3D object detection with object-centric fusion

Q Cai, Y Pan, T Yao, CW Ngo… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Recent progress on multi-modal 3D object detection has featured BEV (Bird-Eye-View)
based fusion, which effectively unifies both LiDAR point clouds and camera images in a …

Cited by 29

A data-scalable transformer for medical image segmentation: architecture, model efficiency, and benchmark

Y Gao, M Zhou, D Liu, Z Yan, S Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org

Transformers have demonstrated remarkable performance in natural language processing
and computer vision. However, existing vision Transformers struggle to learn from limited …

Cited by 113

Control3D: Towards controllable text-to-3D generation

Y Chen, Y Pan, Y Li, T Yao, T Mei - Proceedings of the 31st ACM …, 2023 - dl.acm.org

Recent remarkable advances in large-scale text-to-image diffusion models have inspired a
significant breakthrough in text-to-3D generation, pursuing 3D content creation solely from a …

Cited by 42
