Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y **e - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …

A comprehensive survey on model quantization for deep neural networks in image classification

B Rokh, A Azarpeyvand, A Khanteymoori - ACM Transactions on …, 2023 - dl.acm.org
Recent advancements in machine learning achieved by Deep Neural Networks (DNNs)
have been significant. While demonstrating high accuracy, DNNs are associated with a …

Merf: Memory-efficient radiance fields for real-time view synthesis in unbounded scenes

C Reiser, R Szeliski, D Verbin, P Srinivasan… - ACM Transactions on …, 2023 - dl.acm.org
Neural radiance fields enable state-of-the-art photorealistic view synthesis. However,
existing radiance field representations are either too compute-intensive for real-time …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-power computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Chasing sparsity in vision transformers: An end-to-end exploration

T Chen, Y Cheng, Z Gan, L Yuan… - Advances in Neural …, 2021 - proceedings.neurips.cc
Vision transformers (ViTs) have recently received explosive popularity, but their enormous
model sizes and training costs remain daunting. Conventional post-training pruning often …

A survey on approximate edge AI for energy efficient autonomous driving services

D Katare, D Perino, J Nurmi, M Warnier… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Autonomous driving services depends on active sensing from modules such as camera,
LiDAR, radar, and communication units. Traditionally, these modules process the sensed …

Lingcn: Structural linearized graph convolutional network for homomorphically encrypted inference

H Peng, R Ran, Y Luo, J Zhao… - Advances in …, 2023 - proceedings.neurips.cc
Abstract The growth of Graph Convolution Network (GCN) model sizes has revolutionized
numerous applications, surpassing human performance in areas such as personal …

Autorep: Automatic relu replacement for fast private network inference

H Peng, S Huang, T Zhou, Y Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
The growth of the Machine-Learning-As-A-Service (MLaaS) market has highlighted clients'
data privacy and security issues. Private inference (PI) techniques using cryptographic …

Network quantization with element-wise gradient scaling

J Lee, D Kim, B Ham - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Network quantization aims at reducing bit-widths of weights and/or activations, particularly
important for implementing deep neural networks with limited hardware resources. Most …