A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

A comprehensive survey of convolutions in deep learning: Applications, challenges, and future trends

A Younesi, M Ansari, M Fazli, A Ejlali, M Shafique… - IEEE …, 2024 - ieeexplore.ieee.org
In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning
(DL), are widely used for various computer vision tasks such as image classification, object …

TopFormer: Token pyramid transformer for mobile semantic segmentation

W Zhang, Z Huang, G Luo, T Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Although vision transformers (ViTs) have achieved great success in computer vision, the
heavy computational cost hampers their applications to dense prediction tasks such as …

CMT: Convolutional neural networks meet vision transformers

J Guo, K Han, H Wu, Y Tang, X Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Vision transformers have been successfully applied to image recognition tasks due to their
ability to capture long-range dependencies within an image. However, there are still gaps in …

Less is more: Focus attention for efficient DETR

D Zheng, W Dong, H Hu, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
DETR-like models have significantly boosted the performance of detectors and even
outperformed classical convolutional models. However, all tokens are treated equally …

One-for-all: Bridge the gap between heterogeneous architectures in knowledge distillation

Z Hao, J Guo, K Han, Y Tang, H Hu… - Advances in Neural …, 2023 - proceedings.neurips.cc
Knowledge distillation (KD) has proven to be a highly effective approach for
enhancing model performance through a teacher-student training scheme. However, most …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Post-training quantization for vision transformer

Z Liu, Y Wang, K Han, W Zhang… - Advances in Neural …, 2021 - proceedings.neurips.cc
Recently, transformer has achieved remarkable performance on a variety of computer vision
applications. Compared with mainstream convolutional neural networks, vision transformers …

Joint token pruning and squeezing towards more aggressive compression of vision transformers

S Wei, T Ye, S Zhang, Y Tang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Although vision transformers (ViTs) have shown promising results in various computer vision
tasks recently, their high computational cost limits their practical applications. Previous …

FlexiViT: One model for all patch sizes

L Beyer, P Izmailov, A Kolesnikov… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision Transformers convert images to sequences by slicing them into patches. The size of
these patches controls a speed/accuracy tradeoff, with smaller patches leading to higher …
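
The patch slicing described in the last snippet can be illustrated with a minimal sketch. This is not the FlexiViT implementation: the function name `patchify`, the single-channel list-of-rows image, and the 224 x 224 input size are assumptions made only for illustration.

```python
def patchify(image, patch):
    """Split an H x W image (a list of rows) into flattened, non-overlapping
    patch tokens, ordered row-major over the patch grid.

    Hypothetical helper for illustration; not taken from any of the listed papers.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one patch x patch block into a single token.
            tokens.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return tokens


# Assumed input size: a 224 x 224 single-channel image of zeros.
image = [[0.0] * 224 for _ in range(224)]
for patch in (32, 16, 8):
    tokens = patchify(image, patch)
    print(f"patch={patch}: {len(tokens)} tokens of length {len(tokens[0])}")
```

Halving the patch size quadruples the number of tokens, and because self-attention cost grows roughly quadratically with sequence length, smaller patches are considerably more expensive to process.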