A comprehensive survey of transformers for computer vision

S Jamil, M Jalil Piran, OJ Kwon - Drones, 2023 - mdpi.com
As a special type of transformer, vision transformers (ViTs) can be used for various computer
vision (CV) applications. Convolutional neural networks (CNNs) have several potential …

A survey on transformer compression

Y Tang, Y Wang, J Guo, Z Tu, K Han, H Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformer plays a vital role in the realms of natural language processing (NLP) and
computer vision (CV), especially for constructing large language models (LLM) and large …

Flatten transformer: Vision transformer using focused linear attention

D Han, X Pan, Y Han, S Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
The quadratic computation complexity of self-attention has been a persistent challenge
when applying Transformer models to vision tasks. Linear attention, on the other hand, offers …
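
A rough illustration of the linear-attention idea this entry builds on (a generic kernelized sketch under assumed shapes and an ELU+1 feature map, not the paper's focused linear attention): reordering the matrix products avoids forming the N×N attention map, dropping the cost from quadratic to linear in the number of tokens.

```python
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # Standard attention: materializes an (N, N) score matrix -> O(N^2) in token count.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # (B, N, N)
    return torch.softmax(scores, dim=-1) @ v                # (B, N, D)

def linear_attention(q, k, v, eps=1e-6):
    # Kernelized attention: phi(q) (phi(k)^T v), computed right-to-left so only
    # (D, D) intermediates appear -> O(N) in token count. ELU+1 is an assumed,
    # simple positive feature map, not the paper's focused mapping.
    phi_q, phi_k = F.elu(q) + 1, F.elu(k) + 1
    kv = phi_k.transpose(-2, -1) @ v                                      # (B, D, D)
    z = phi_q @ phi_k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps   # (B, N, 1)
    return (phi_q @ kv) / z

B, N, D = 2, 196, 64   # e.g. 14x14 patch tokens per image (assumed sizes)
q, k, v = (torch.randn(B, N, D) for _ in range(3))
print(softmax_attention(q, k, v).shape, linear_attention(q, k, v).shape)
```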

Adaptive rotated convolution for rotated object detection

Y Pu, Y Wang, Z Xia, Y Han, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Rotated object detection aims to identify and locate objects in images with arbitrary
orientation. In this scenario, the oriented directions of objects vary considerably across …

Vision transformer with deformable attention

Z Xia, X Pan, S Song, LE Li… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Transformers have recently shown superior performance on various vision tasks. The large,
sometimes even global, receptive field endows Transformer models with higher …

Rank-DETR for high quality object detection

Y Pu, W Liang, Y Hao, Y Yuan… - Advances in …, 2023 - proceedings.neurips.cc
Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …
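
A minimal sketch of the confidence-based selection step this snippet describes (standard DETR-style post-processing, not Rank-DETR's rank-oriented design; the query count, class count, and k are assumptions):

```python
import torch

def select_top_k(pred_logits, pred_boxes, k=100):
    # pred_logits: (num_queries, num_classes); pred_boxes: (num_queries, 4)
    probs = pred_logits.sigmoid()                  # per-class confidence
    scores, labels = probs.max(dim=-1)             # best class per query
    keep = scores.argsort(descending=True)[:k]     # rank queries by confidence, keep top-k
    return pred_boxes[keep], labels[keep], scores[keep]

logits = torch.randn(300, 80)   # 300 object queries, 80 classes (COCO-like, assumed)
boxes = torch.rand(300, 4)      # normalized (cx, cy, w, h)
top_boxes, top_labels, top_scores = select_top_k(logits, boxes)
print(top_boxes.shape, top_scores[:3])
```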

A survey of visual transformers

Y Liu, Y Zhang, Y Wang, F Hou, J Yuan… - … on Neural Networks …, 2023 - ieeexplore.ieee.org
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …

GSVA: Generalized segmentation via multimodal large language models

Z Xia, D Han, Y Han, X Pan, S Song… - Proceedings of the …, 2024 - openaccess.thecvf.com
Abstract Generalized Referring Expression Segmentation (GRES) extends the scope of
classic RES to refer to multiple objects in one expression or identify the empty targets absent …

FlexiViT: One model for all patch sizes

L Beyer, P Izmailov, A Kolesnikov… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision Transformers convert images to sequences by slicing them into patches. The size of
these patches controls a speed/accuracy tradeoff, with smaller patches leading to higher …
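
A small sketch of the patch-slicing step the snippet refers to (plain ViT patchification, not FlexiViT's patch-size-flexible embedding; the 224×224 input and patch sizes are assumptions), showing how smaller patches yield more tokens and hence more attention compute:

```python
import torch

def patchify(images, patch_size):
    # images: (B, C, H, W) -> tokens: (B, num_patches, C * patch_size**2)
    b, c, h, w = images.shape
    p = patch_size
    patches = images.unfold(2, p, p).unfold(3, p, p)    # (B, C, H/p, W/p, p, p)
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)

imgs = torch.randn(1, 3, 224, 224)
for p in (32, 16, 8):
    print(p, patchify(imgs, p).shape)   # 49, 196, and 784 tokens respectively
```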

Which tokens to use? Investigating token reduction in vision transformers

JB Haurum, S Escalera, GW Taylor… - Proceedings of the …, 2023 - openaccess.thecvf.com
Since the introduction of the Vision Transformer (ViT), researchers have sought to make ViTs
more efficient by removing redundant information in the processed tokens. While different …
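
One common token-reduction idea surveyed here, sketched under assumptions (pruning patch tokens by a [CLS]-attention score; an illustrative baseline, not a specific method from the paper):

```python
import torch

def prune_tokens(tokens, cls_attn, keep=49):
    # tokens: (B, 1 + N, D) with [CLS] first; cls_attn: (B, N) attention from [CLS] to patches
    idx = cls_attn.argsort(dim=-1, descending=True)[:, :keep]    # most-attended patch indices
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])
    kept = torch.gather(tokens[:, 1:], 1, idx)                   # (B, keep, D)
    return torch.cat([tokens[:, :1], kept], dim=1)               # (B, 1 + keep, D)

x = torch.randn(2, 1 + 196, 384)    # [CLS] + 14x14 patch tokens (assumed ViT-S width)
attn = torch.rand(2, 196)           # stand-in for [CLS]->patch attention weights
print(prune_tokens(x, attn).shape)  # torch.Size([2, 50, 384])
```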