ShiftAddViT: Mixture of multiplication primitives towards efficient vision transformer

H You, H Shi, Y Guo, Y Lin - Advances in Neural …, 2023 - proceedings.neurips.cc
Vision Transformers (ViTs) have shown impressive performance and have become
a unified backbone for multiple vision tasks. However, both the attention mechanism and …

Model quantization and hardware acceleration for vision transformers: A comprehensive survey

D Du, G Gong, X Chu - arXiv preprint arXiv:2405.00314, 2024 - arxiv.org
Vision Transformers (ViTs) have recently garnered considerable attention, emerging as a
promising alternative to convolutional neural networks (CNNs) in several vision-related …

A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends

A Younesi, M Ansari, M Fazli, A Ejlali, M Shafique… - IEEE …, 2024 - ieeexplore.ieee.org
In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning
(DL), are widely used for various computer vision tasks such as image classification, object …

EfficientDM: Efficient quantization-aware fine-tuning of low-bit diffusion models

Y He, J Liu, W Wu, H Zhou, B Zhuang - arXiv preprint arXiv:2310.03270, 2023 - arxiv.org
Diffusion models have demonstrated remarkable capabilities in image synthesis and related
generative tasks. Nevertheless, their practicality for low-latency real-world applications is …

Efficient multimodal large language models: A survey

Y **, J Li, Y Liu, T Gu, K Wu, Z Jiang, M He… - arXiv preprint arXiv …, 2024 - arxiv.org
In the past year, Multimodal Large Language Models (MLLMs) have demonstrated
remarkable performance in tasks such as visual question answering, visual understanding …

ViT-1.58b: Mobile vision transformers in the 1-bit era

Z Yuan, R Zhou, H Wang, L He, Y Ye, L Sun - arXiv preprint arXiv …, 2024 - arxiv.org
Vision Transformers (ViTs) have achieved remarkable performance in various image
classification tasks by leveraging the attention mechanism to process image patches as …

A General and Efficient Training for Transformer via Token Expansion

W Huang, Y Shen, J **e, B Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
The remarkable performance of Vision Transformers (ViTs) typically requires an extremely
large training cost. Existing methods have attempted to accelerate the training of ViTs yet …

BiDM: Pushing the Limit of Quantization for Diffusion Models

X Zheng, X Liu, Y Bian, X Ma, Y Zhang, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models (DMs) have been significantly developed and widely used in various
applications due to their excellent generative qualities. However, the expensive computation …

Understanding neural network binarization with forward and backward proximal quantizers

Y Lu, Y Yu, X Li, V Partovi Nia - Advances in Neural …, 2023 - proceedings.neurips.cc
In neural network binarization, BinaryConnect (BC) and its variants are considered the
standard. These methods apply the sign function in their forward pass and their respective …

GSB: Group superposition binarization for vision transformer with limited training samples

T Gao, CZ Xu, L Zhang, H Kong - Neural Networks, 2024 - Elsevier
Vision Transformer (ViT) has performed remarkably in various computer vision
tasks. Nonetheless, affected by the massive amount of parameters, ViT usually suffers from …