A survey of quantization methods for efficient neural network inference
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …
Edge-cloud polarization and collaboration: A comprehensive survey for AI
Influenced by the great success of deep learning via cloud computing and the rapid
development of edge chips, research in artificial intelligence (AI) has shifted to both of the …
Computational complexity evaluation of neural network applications in signal processing
P Freire, S Srivallapanondh, A Napoli… - ar…
… quantization for extremely low-bit post-training quantization
Recently, post-training quantization (PTQ) has driven much attention to produce efficient
neural networks without long-time retraining. Despite its low cost, current PTQ works tend to …
I-ViT: Integer-only quantization for efficient vision transformer inference
Vision Transformers (ViTs) have achieved state-of-the-art performance on various
computer vision applications. However, these models have considerable storage and …