A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI open, 2022 - Elsevier
Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives

J Li, J Chen, Y Tang, C Wang, BA Landman… - Medical image …, 2023 - Elsevier
Transformer, one of the latest technological advances of deep learning, has gained
prevalence in natural language processing or computer vision. Since medical imaging bear …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - IEEE transactions on …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Tip-adapter: Training-free adaption of clip for few-shot classification

R Zhang, W Zhang, R Fang, P Gao, K Li, J Dai… - European conference on …, 2022 - Springer
Abstract Contrastive Vision-Language Pre-training, known as CLIP, has provided a new
paradigm for learning visual representations using large-scale image-text pairs. It shows …

Point-m2ae: multi-scale masked autoencoders for hierarchical point cloud pre-training

R Zhang, Z Guo, P Gao, R Fang… - Advances in neural …, 2022 - proceedings.neurips.cc
Masked Autoencoders (MAE) have shown great potentials in self-supervised pre-training for
language and 2D image transformers. However, it still remains an open question on how to …

Pointclip: Point cloud understanding by clip

R Zhang, Z Guo, W Zhang, K Li… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recently, zero-shot and few-shot learning via Contrastive Vision-Language Pre-training
(CLIP) have shown inspirational performance on 2D visual recognition, which learns to …

Conditional detr for fast training convergence

D Meng, X Chen, Z Fan, G Zeng, H Li… - Proceedings of the …, 2021 - openaccess.thecvf.com
The recently-developed DETR approach applies the transformer encoder and decoder
architecture to object detection and achieves promising performance. In this paper, we …

Understanding the robustness in vision transformers

D Zhou, Z Yu, E **e, C **ao… - International …, 2022 - proceedings.mlr.press
Recent studies show that Vision Transformers (ViTs) exhibit strong robustness against
various corruptions. Although this property is partly attributed to the self-attention …

A survey of visual transformers

Y Liu, Y Zhang, Y Wang, F Hou, J Yuan… - … on Neural Networks …, 2023 - ieeexplore.ieee.org
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …

Cvt: Introducing convolutions to vision transformers

H Wu, B Xiao, N Codella, M Liu, X Dai… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present in this paper a new architecture, named Convolutional vision Transformer (CvT),
that improves Vision Transformer (ViT) in performance and efficiency by introducing …