Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …

FrugalGPT: How to use large language models while reducing cost and improving performance

L Chen, M Zaharia, J Zou - arXiv preprint arXiv:2305.05176, 2023 - arxiv.org
There is a rapidly growing number of large language models (LLMs) that users can query for
a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-power computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

Binary neural networks: A survey

H Qin, R Gong, X Liu, X Bai, J Song, N Sebe - Pattern Recognition, 2020 - Elsevier
The binary neural network, largely saving the storage and computation, serves as a
promising technique for deploying deep models on resource-limited devices. However, the …

A comprehensive survey on model compression and acceleration

T Choudhary, V Mishra, A Goswami… - Artificial Intelligence …, 2020 - Springer
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable
improvement in computer vision, natural language processing, stock prediction, forecasting …

How important are activation functions in regression and classification? A survey, performance comparison, and future directions

AD Jagtap, GE Karniadakis - Journal of Machine Learning for …, 2023 - dl.begellhouse.com
Inspired by biological neurons, the activation functions play an essential part in the learning
process of any artificial neural network (ANN) commonly used in many real-world problems …

Learned step size quantization

SK Esser, JL McKinstry, D Bablani… - arXiv preprint arXiv …, 2019 - arxiv.org
Deep networks run with low precision operations at inference time offer power and space
advantages over high precision alternatives, but need to overcome the challenge of …

ReActNet: Towards precise binary neural network with generalized activation functions

Z Liu, Z Shen, M Savvides, KT Cheng - … Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
In this paper, we propose several ideas for enhancing a binary network to close its accuracy
gap from real-valued networks without incurring any additional computational cost. We first …