- Academic Search

Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2024 - Elsevier

The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

Cited by: 149 · All 7 versions

[HTML] mdpi.com

[HTML] Review of image classification algorithms based on convolutional neural networks

L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com

Image classification has always been a hot research direction in the world, and the
emergence of deep learning has promoted the development of this field. Convolutional …

Cited by: 608 · All 5 versions

[PDF] neurips.cc

Vmamba: Visual state space model

Y Liu, Y Tian, Y Zhao, H Yu, L Xie… - Advances in neural …, 2025 - proceedings.neurips.cc

Designing computationally efficient network architectures remains an ongoing necessity in
computer vision. In this paper, we adapt Mamba, a state-space language model, into …

Cited by: 1038 · All 5 versions

[PDF] thecvf.com

Run, don't walk: chasing higher FLOPS for faster neural networks

J Chen, S Kao, H He, W Zhuo, S Wen… - Proceedings of the …, 2023 - openaccess.thecvf.com

To design fast neural networks, many works have been focusing on reducing the number of
floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does …

Cited by: 1201 · All 10 versions

[PDF] thecvf.com

Convnext v2: Co-designing and scaling convnets with masked autoencoders

S Woo, S Debnath, R Hu, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Driven by improved architectures and better representation learning frameworks, the field of
visual recognition has enjoyed rapid modernization and performance boost in the early …

Cited by: 693 · All 8 versions

[PDF] thecvf.com

Internimage: Exploring large-scale vision foundation models with deformable convolutions

W Wang, J Dai, Z Chen, Z Huang, Z Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Compared to the great progress of large-scale vision transformers (ViTs) in recent years,
large-scale models based on convolutional neural networks (CNNs) are still in an early …

Cited by: 814 · All 8 versions

[PDF] thecvf.com

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

CY Wang, A Bochkovskiy… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Real-time object detection is one of the most important research topics in computer vision.
As new approaches regarding architecture optimization and training optimization are …

Cited by: 9633 · All 10 versions

[PDF] arxiv.org

Maxvit: Multi-axis vision transformer

Z Tu, H Talebi, H Zhang, F Yang, P Milanfar… - European conference on …, 2022 - Springer

Transformers have recently gained significant attention in the computer vision community.
However, the lack of scalability of self-attention mechanisms with respect to image size has …

Cited by: 778 · All 8 versions

[PDF] springer.com

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer

While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

Cited by: 799 · All 8 versions

[PDF] thecvf.com

A convnet for the 2020s

Z Liu, H Mao, CY Wu, C Feichtenhofer… - Proceedings of the …, 2022 - openaccess.thecvf.com

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers
(ViTs), which quickly superseded ConvNets as the state-of-the-art image classification …

Cited by: 6989 · All 11 versions

Efficientnetv2: Smaller models and faster training

Advances in medical image analysis with vision transformers: a comprehensive review