RepViT: Revisiting mobile CNN from ViT perspective

A Wang, H Chen, Z Lin, J Han… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Recently, lightweight Vision Transformers (ViTs) have demonstrated superior performance
and lower latency compared with lightweight Convolutional Neural Networks (CNNs) on …

Spike-driven Transformer

M Yao, J Hu, Z Zhou, L Yuan, Y Tian… - Advances in neural …, 2024 - proceedings.neurips.cc
Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option
due to their unique spike-based event-driven (i.e., spike-driven) paradigm. In this paper, we …
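For context on the spike-based, event-driven (spike-driven) paradigm this abstract mentions, below is a minimal sketch of a leaky integrate-and-fire (LIF) neuron, the standard spiking unit: it integrates input current and fires a binary spike only when its membrane potential crosses a threshold, so downstream computation happens only at spike events. This is generic SNN background with illustrative parameter values, not the paper's Spike-driven Transformer design.

import numpy as np

def lif_neuron(currents, tau=2.0, v_th=1.0, v_reset=0.0):
    """currents: (T,) input current per timestep; returns (T,) binary spikes."""
    v, spikes = v_reset, []
    for x in currents:
        v = v + (x - v) / tau      # leaky integration toward the input
        s = float(v >= v_th)       # emit a binary spike at threshold
        v = v_reset if s else v    # hard reset after spiking
        spikes.append(s)
    return np.array(spikes)

print(lif_neuron(np.array([0.6, 0.9, 1.2, 0.1, 1.5, 1.5])))  # [0. 0. 0. 0. 1. 0.]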

MobileCLIP: Fast image-text models through multi-modal reinforced training

PKA Vasu, H Pouransari, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
Contrastive pre-training of image-text foundation models such as CLIP has demonstrated
excellent zero-shot performance and improved robustness on a wide range of downstream …
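As background on the contrastive pre-training this abstract refers to, here is a hedged numpy sketch of a CLIP-style symmetric contrastive loss over a batch of paired image/text embeddings: matched pairs lie on the diagonal of the similarity matrix and are pulled together in both directions. The function name, temperature, and toy inputs are illustrative assumptions; MobileCLIP's multi-modal reinforced training adds components not shown here.

import numpy as np

def clip_contrastive_loss(img, txt, temperature=0.07):
    """img, txt: (B, d) embeddings of matched image/text pairs."""
    img = img / np.linalg.norm(img, axis=1, keepdims=True)   # L2-normalize
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temperature                       # (B, B) similarities
    # Cross-entropy with the diagonal (matched pairs) as targets, both directions.
    log_p_i2t = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_t2i = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return -(np.mean(np.diag(log_p_i2t)) + np.mean(np.diag(log_p_t2i))) / 2

rng = np.random.default_rng(0)
loss = clip_contrastive_loss(rng.standard_normal((4, 8)),
                             rng.standard_normal((4, 8)))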

PEM: Prototype-based efficient MaskFormer for image segmentation

N Cavagnero, G Rosi, C Cuttano… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent transformer-based architectures have shown impressive results in the field of image
segmentation. Thanks to their flexibility, they obtain outstanding performance in multiple …

SHViT: Single-head vision transformer with memory-efficient macro design

S Yun, Y Ro - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Recently, efficient Vision Transformers have shown great performance with low
latency on resource-constrained devices. Conventionally, they use 4×4 patch embeddings …
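To illustrate the conventional 4×4 patch embedding this abstract mentions, the sketch below splits an image into non-overlapping 4×4 patches and linearly projects each patch to a token vector. Names, shapes, and the random projection are assumptions for illustration, not SHViT's actual macro design.

import numpy as np

def patch_embed(img, w, p=4):
    """img: (H, W, C); w: (p*p*C, d) projection; returns (H//p * W//p, d) tokens."""
    h, wd, c = img.shape
    patches = (img.reshape(h // p, p, wd // p, p, c)
                  .transpose(0, 2, 1, 3, 4)       # group pixels by patch
                  .reshape(-1, p * p * c))        # one row per 4x4 patch
    return patches @ w                            # linear projection to tokens

rng = np.random.default_rng(0)
tokens = patch_embed(rng.standard_normal((32, 32, 3)),
                     rng.standard_normal((4 * 4 * 3, 64)))
print(tokens.shape)  # (64, 64): an 8x8 token grid, 64 dims per token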

Optimizing underwater image enhancement: integrating semi-supervised learning and multi-scale aggregated attention

S Xu, J Wang, N He, G Xu, G Zhang - The Visual Computer, 2024 - Springer
Underwater image enhancement is critical for advancing marine science and underwater
engineering. Traditional methods often struggle with color distortion, low contrast, and …

CAS-ViT: Convolutional additive self-attention vision transformers for efficient mobile applications

T Zhang, L Li, Y Zhou, W Liu, C Qian, X Ji - arXiv preprint arXiv …, 2024 - arxiv.org
Vision Transformers (ViTs) mark a revolutionary advance in neural networks with their token
mixer's powerful global context capability. However, the pairwise token affinity and complex …

HF-HRNet: a simple hardware-friendly high-resolution network

H Zhang, Y Dun, Y Pei, S Lai, C Liu… - … on Circuits and …, 2024 - ieeexplore.ieee.org
High-resolution networks have made significant progress in dense prediction tasks such as
human pose estimation and semantic segmentation. To better explore this high-resolution …

SwiftDepth: An efficient hybrid CNN-Transformer model for self-supervised monocular depth estimation on mobile devices

A Luginov, I Makarov - 2023 IEEE International Symposium on …, 2023 - ieeexplore.ieee.org
Self-supervised Monocular Depth Estimation (MDE) models trained solely on single-camera
video have gained significant popularity. Recent studies have shown that Vision …

Efficient Vision Transformers with Partial Attention

XT Vo, DL Nguyen, A Priadana, KH Jo - European Conference on …, 2024 - Springer
As the core of the Vision Transformer (ViT), self-attention is highly versatile in modeling
long-range spatial interactions because every query attends to all spatial locations. Although ViT …
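As a reference point for the full self-attention described above, where every query attends to all spatial locations and the pairwise affinities form an N×N matrix, here is a minimal numpy sketch. All names and shapes are illustrative; the paper's partial-attention variant is not reproduced here.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """x: (N, d) tokens; wq/wk/wv: (d, d) projections; returns (N, d)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (N, N) query-key affinities
    return softmax(scores) @ v                # every query mixes all values

rng = np.random.default_rng(0)
n, d = 16, 32
x = rng.standard_normal((n, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(x, wq, wk, wv)           # (16, 32)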