Google Scholar

Vision Permutator: A permutable MLP-like architecture for visual recognition

Adventures in data analysis: A systematic review of Deep Learning techniques for pattern recognition in cyber-physical-social systems

Z Amiri, A Heidari, NJ Navimipour, M Unal… - Multimedia Tools and …, 2024 - Springer

Machine Learning (ML) and Deep Learning (DL) have achieved high success in
many textual, auditory, medical imaging, and visual recognition patterns. Concerning the …

Cited by 112

Are we ready for a new paradigm shift? A survey on visual deep MLP

R Liu, Y Li, L Tao, D Liang, HT Zheng - Patterns, 2022 - cell.com

Recently, the proposed deep multilayer perceptron (MLP) models have stirred up a lot of
interest in the vision community. Historically, the availability of larger datasets combined with …

Cited by 88

Scaling & shifting your features: A new baseline for efficient model tuning

D Lian, D Zhou, J Feng, X Wang - Advances in Neural …, 2022 - proceedings.neurips.cc

Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-
tuning), which is not efficient, or only tune the last linear layer (linear probing), which suffers …

Cited by 246

DaViT: Dual attention vision transformers

M Ding, B Xiao, N Codella, P Luo, J Wang… - European conference on …, 2022 - Springer

In this work, we introduce Dual Attention Vision Transformers (DaViT), a simple yet effective
vision transformer architecture that is able to capture global context while maintaining …

Cited by 358

MetaFormer baselines for vision

W Yu, C Si, P Zhou, M Luo, Y Zhou… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

MetaFormer, the abstracted architecture of Transformer, has been found to play a significant
role in achieving competitive performance. In this paper, we further explore the capacity of …

Cited by 183

MetaFormer is actually what you need for vision

W Yu, M Luo, P Zhou, C Si, Y Zhou… - Proceedings of the …, 2022 - openaccess.thecvf.com

Transformers have shown great potential in computer vision tasks. A common belief is their
attention-based token mixer module contributes most to their competence. However, recent …

Cited by 1147

Centralized feature pyramid for object detection

Y Quan, D Zhang, L Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

The visual feature pyramid has shown its superiority in both effectiveness and efficiency in a
variety of applications. However, current methods overly focus on inter-layer feature …

Cited by 193

Focal modulation networks

J Yang, C Li, X Dai, J Gao - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We propose focal modulation networks (FocalNets in short), where self-attention (SA) is
completely replaced by a focal modulation module for modeling token interactions in vision …

Cited by 290

Delivering arbitrary-modal semantic segmentation

J Zhang, R Liu, H Shi, K Yang, S Reiß… - Proceedings of the …, 2023 - openaccess.thecvf.com

Multimodal fusion can make semantic segmentation more robust. However, fusing an
arbitrary number of modalities remains underexplored. To delve into this problem, we create …

Cited by 106

Conv2Former: A simple transformer-style convnet for visual recognition

Q Hou, CZ Lu, MM Cheng… - IEEE transactions on …, 2024 - ieeexplore.ieee.org

Vision Transformers have been the most popular network architecture in visual recognition
recently due to the strong ability of encode global information. However, its high …

Cited by 152
