- Academic Search

A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective

C Chen, Y Wu, Q Dai, HY Zhou, M Xu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …

Cited by 71

[PDF] thecvf.com

Swin transformer: Hierarchical vision transformer using shifted windows

Z Liu, Y Lin, Y Cao, H Hu, Y Wei… - Proceedings of the …, 2021 - openaccess.thecvf.com

This paper presents a new vision Transformer, called Swin Transformer, that capably serves
as a general-purpose backbone for computer vision. Challenges in adapting Transformer …

Cited by 26848

[PDF] baai.ac.cn

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Cited by 2678

[PDF] neurips.cc

Adaptformer: Adapting vision transformers for scalable visual recognition

S Chen, C Ge, Z Tong, J Wang… - Advances in …, 2022 - proceedings.neurips.cc

Abstract Pretraining Vision Transformers (ViTs) has achieved great success in visual
recognition. A following scenario is to adapt a ViT to various image and video recognition …

Cited by 611

[PDF] thecvf.com

Cmt: Convolutional neural networks meet vision transformers

J Guo, K Han, H Wu, Y Tang, X Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com

Vision transformers have been successfully applied to image recognition tasks due to their
ability to capture long-range dependencies within an image. However, there are still gaps in …

Cited by 870

[PDF] thecvf.com

Conformer: Local features coupling global representations for visual recognition

Z Peng, W Huang, S Gu, L Xie… - Proceedings of the …, 2021 - openaccess.thecvf.com

Abstract Within Convolutional Neural Network (CNN), the convolution operations are good
at extracting local features but experience difficulty to capture global representations. Within …

Cited by 821

[PDF] thecvf.com

Dynamic head: Unifying object detection heads with attentions

X Dai, Y Chen, B Xiao, D Chen, M Liu… - Proceedings of the …, 2021 - openaccess.thecvf.com

The complex nature of combining localization and classification in object detection has
resulted in the flourished development of methods. Previous works tried to improve the …

Cited by 764

[PDF] thecvf.com

Learning to prompt for open-vocabulary object detection with vision-language model

Y Du, F Wei, Z Zhang, M Shi… - Proceedings of the …, 2022 - openaccess.thecvf.com

Recently, vision-language pre-training shows great potential in open-vocabulary object
detection, where detectors trained on base classes are devised for detecting new classes …

Cited by 357

[PDF] thecvf.com

Rethinking transformer-based set prediction for object detection

Z Sun, S Cao, Y Yang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

DETR is a recently proposed Transformer-based method which views object detection as a
set prediction problem and achieves state-of-the-art performance but demands extra-long …

Cited by 411

[PDF] arxiv.org

A survey on visual transformer

K Han, Y Wang, H Chen, X Chen, J Guo, Z Liu… - arXiv preprint arXiv …, 2020 - arxiv.org

Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Cited by 389

