Google Scholar

Medical image segmentation review: The success of u-net

R Azad, EK Aghdam, A Rauland, Y Jia… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Automatic medical image segmentation is a crucial topic in the medical domain and
successively a critical counterpart in the computer-aided diagnosis paradigm. U-Net is the …

Cited by 276 - All 2 versions

[HTML] sciencedirect.com

[HTML] A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI open, 2022 - Elsevier

Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

Cited by 1458 - All 4 versions

[PDF] arxiv.org

Vision transformer adapter for dense predictions

Z Chen, Y Duan, W Wang, J He, T Lu, J Dai… - arXiv preprint arXiv …, 2022 - arxiv.org

This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …

Cited by 615 - All 3 versions

[PDF] arxiv.org

Maxvit: Multi-axis vision transformer

Z Tu, H Talebi, H Zhang, F Yang, P Milanfar… - European conference on …, 2022 - Springer

Transformers have recently gained significant attention in the computer vision community.
However, the lack of scalability of self-attention mechanisms with respect to image size has …

Cited by 769 - All 8 versions

[PDF] thecvf.com

Scaling up your kernels to 31x31: Revisiting large kernel design in cnns

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …

[PDF] thecvf.com

Point Transformer V3: Simpler Faster Stronger

X Wu, L Jiang, PS Wang, Z Liu, X Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com

This paper is not motivated to seek innovation within the attention mechanism. Instead it
focuses on overcoming the existing trade-offs between accuracy and efficiency within the …

Cited by 172 - All 4 versions

[PDF] arxiv.org

Davit: Dual attention vision transformers

M Ding, B Xiao, N Codella, P Luo, J Wang… - European conference on …, 2022 - Springer

In this work, we introduce Dual Attention Vision Transformers (DaViT), a simple yet effective
vision transformer architecture that is able to capture global context while maintaining …

Cited by 345 - All 5 versions

[PDF] thecvf.com

Cmt: Convolutional neural networks meet vision transformers

J Guo, K Han, H Wu, Y Tang, X Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com

Vision transformers have been successfully applied to image recognition tasks due to their
ability to capture long-range dependencies within an image. However, there are still gaps in …

Cited by 866 - All 6 versions

[PDF] thecvf.com

Cswin transformer: A general vision transformer backbone with cross-shaped windows

X Dong, J Bao, D Chen, W Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

We present CSWin Transformer, an efficient and effective Transformer-based
backbone for general-purpose vision tasks. A challenging issue in Transformer design is …

[PDF] thecvf.com

Extracting motion and appearance via inter-frame attention for efficient video frame interpolation

G Zhang, Y Zhu, H Wang, Y Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Effectively extracting inter-frame motion and appearance information is important for video
frame interpolation (VFI). Previous works either extract both types of information in a mixed …

Cited by 110 - All 6 versions
