Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
improvement in general-purpose processors due to the foreseeable end of Moore's Law …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages/disadvantages of current methods …
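
For a concrete picture of the uniform affine quantization such surveys cover, here is a minimal numpy sketch (our own illustrative code, not taken from the chapter; the function names and 8-bit setting are assumptions):

    import numpy as np

    def quantize(x, num_bits=8):
        # Affine (asymmetric) uniform quantization to unsigned integers.
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = max((x.max() - x.min()) / (qmax - qmin), 1e-12)
        zero_point = int(round(qmin - x.min() / scale))
        q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
        return q.astype(np.uint8), scale, zero_point

    def dequantize(q, scale, zero_point):
        # Map integers back to floats; error is at most ~scale/2 per value.
        return scale * (q.astype(np.float32) - zero_point)

    x = np.random.randn(4).astype(np.float32)
    q, s, z = quantize(x)
    print(x, dequantize(q, s, z))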

Computational complexity evaluation of neural network applications in signal processing

P Freire, S Srivallapanondh, A Napoli… - arXiv preprint arXiv …, 2022 - arxiv.org
In this paper, we provide a systematic approach for assessing and comparing the
computational complexity of neural network layers in digital signal processing. We provide …
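
For a flavor of what such layer-level accounting looks like, here is a back-of-envelope multiply-accumulate (MAC) count for two common layers (a generic sketch under standard conventions, not the paper's own methodology):

    def dense_macs(n_in, n_out):
        # A fully connected layer performs one MAC per weight.
        return n_in * n_out

    def conv2d_macs(h_out, w_out, c_in, c_out, k):
        # Each output pixel of each output channel needs k*k*c_in MACs.
        return h_out * w_out * c_out * k * k * c_in

    # Example: a 3x3 conv, 64 -> 128 channels, 56x56 output feature map.
    print(conv2d_macs(56, 56, 64, 128, 3))  # 231,211,008 MACs (~0.46 GFLOPs)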

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications, exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …
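
The pruning half of this topic is easy to demonstrate; below is a minimal sketch of unstructured magnitude pruning (our own illustrative code; the survey itself covers many more variants):

    import numpy as np

    def magnitude_prune(w, sparsity=0.9):
        # Zero out the smallest-magnitude weights, keeping the largest ones.
        k = int(sparsity * w.size)
        threshold = np.partition(np.abs(w).ravel(), k)[k]
        mask = np.abs(w) >= threshold
        return w * mask, mask

    w = np.random.randn(256, 256)
    w_pruned, mask = magnitude_prune(w, sparsity=0.9)
    print(1.0 - mask.mean())  # achieved sparsity, close to 0.9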

BatchCrypt: Efficient homomorphic encryption for cross-silo federated learning

C Zhang, S Li, J Xia, W Wang, F Yan, Y Liu - 2020 USENIX annual …, 2020 - usenix.org
Cross-silo federated learning (FL) enables organizations (e.g., financial or medical) to
collaboratively train a machine learning model by aggregating local gradient updates from …
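
BatchCrypt's key trick is to quantize gradients and batch many of them into one plaintext, so that a single homomorphic addition aggregates a whole batch. The toy sketch below simulates only the packing arithmetic with plain integers (no actual encryption; all constants and names are our own illustrative choices):

    import numpy as np

    SLOT_BITS = 16   # width of each slot in the packed integer
    SCALE = 1024     # fixed-point scale for gradient values
    OFFSET = 2048    # shifts signed values into the unsigned slot range

    def pack(grads):
        # Quantize to small signed integers, then pack into one big int.
        q = np.clip(np.round(grads * SCALE), -OFFSET, OFFSET - 1).astype(int)
        packed = 0
        for v in q:
            packed = (packed << SLOT_BITS) | int(v + OFFSET)
        return packed

    def unpack(packed, n, k=1):
        # Recover n values from a sum of k packed gradient vectors;
        # each slot carries k offsets that must be subtracted.
        out = []
        for _ in range(n):
            out.append((packed & ((1 << SLOT_BITS) - 1)) - k * OFFSET)
            packed >>= SLOT_BITS
        return np.array(out[::-1], dtype=np.float64) / SCALE

    g1, g2 = np.random.randn(8) * 0.01, np.random.randn(8) * 0.01
    # Adding the packed integers adds every slot at once, which is what
    # the aggregator does on ciphertexts under an additively homomorphic
    # scheme such as Paillier.
    agg = unpack(pack(g1) + pack(g2), 8, k=2)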

Differentiable soft quantization: Bridging full-precision and low-bit neural networks

R Gong, X Liu, S Jiang, T Li, P Hu… - Proceedings of the …, 2019 - openaccess.thecvf.com
Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently
accelerate inference while reducing the memory consumption of deep neural …
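
The core idea is to replace the non-differentiable rounding staircase with a smooth surrogate. A rough numpy sketch of a tanh-shaped soft quantizer is below (our own simplification; the paper's exact function and its learned temperature differ):

    import numpy as np

    def soft_quantize(x, k=5.0, levels=4, lo=-1.0, hi=1.0):
        # A tanh reshapes each quantization cell; as k grows, the output
        # approaches the hard uniform-quantization staircase.
        delta = (hi - lo) / (levels - 1)
        x = np.clip(x, lo, hi)
        center = lo + np.floor((x - lo) / delta + 0.5) * delta
        s = 1.0 / np.tanh(0.5 * k * delta)  # rescale tanh to span one cell
        return center + 0.5 * delta * s * np.tanh(k * (x - center))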

Integer quantization for deep learning inference: Principles and empirical evaluation

H Wu, P Judd, X Zhang, M Isaev… - arXiv preprint arXiv …, 2020 - arxiv.org
Quantization techniques can reduce the size of deep neural networks and improve
inference latency and throughput by taking advantage of high-throughput integer …
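
The basic recipe evaluated in this line of work is symmetric int8 quantization with 32-bit integer accumulation, followed by a single floating-point rescale. A minimal numpy sketch (our own illustrative code and names):

    import numpy as np

    def int8_symmetric(x):
        # Per-tensor symmetric quantization onto [-127, 127].
        scale = np.abs(x).max() / 127.0
        return np.round(x / scale).astype(np.int8), scale

    a = np.random.randn(64, 128).astype(np.float32)
    b = np.random.randn(128, 32).astype(np.float32)
    qa, sa = int8_symmetric(a)
    qb, sb = int8_symmetric(b)

    # Integer matmul with int32 accumulation, then one float rescale.
    acc = qa.astype(np.int32) @ qb.astype(np.int32)
    approx = acc.astype(np.float32) * (sa * sb)
    print(np.abs(approx - a @ b).max())  # small quantization error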

Adabin: Improving binary neural networks with adaptive binary sets

Z Tu, X Chen, P Ren, Y Wang - European conference on computer vision, 2022 - Springer
This paper studies Binary Neural Networks (BNNs), in which weights and activations are
both binarized into 1-bit values, thus greatly reducing the memory usage and computational …
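
For context, classic 1-bit quantization maps weights to a fixed pair {-a, +a}; AdaBin's contribution is to make the binary set adaptive. The sketch below shows the classic scheme plus a crude adaptive variant (our own illustration; the paper learns its binary sets rather than deriving them from moments):

    import numpy as np

    def binarize(w):
        # Classic 1-bit weights: sign(w) scaled by alpha = mean(|w|),
        # which minimizes the L2 error for a fixed set {-a, +a}.
        alpha = np.abs(w).mean()
        return alpha * np.sign(w)

    def binarize_adaptive(w):
        # Adaptive binary set {b1, b2} = center +/- deviation, in the
        # spirit of AdaBin; here crudely set from the mean and std.
        center, dev = w.mean(), w.std()
        return np.where(w >= center, center + dev, center - dev)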

Post-training piecewise linear quantization for deep neural networks

J Fang, A Shafiee, H Abdel-Aziz, D Thorsley… - Computer Vision–ECCV …, 2020 - Springer
Quantization plays an important role in the energy-efficient deployment of deep neural
networks on resource-limited devices. Post-training quantization is highly desirable since it …
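
The motivation is that bell-shaped weight distributions waste uniform quantization levels on rare tail values; splitting the range at a breakpoint and giving the dense center and the tails separate uniform quantizers keeps the center scale fine. A rough sketch (our own code; the paper additionally derives the optimal breakpoint, which here is just an argument):

    import numpy as np

    def uniform_q(x, lo, hi, bits):
        # Uniform quantizer over [lo, hi] with 2**bits levels.
        scale = (hi - lo) / (2 ** bits - 1)
        return lo + np.round((x - lo) / scale) * scale

    def pwl_quantize(x, bp, bits=4):
        # One breakpoint bp: the dense center [-bp, bp] and the tails
        # each get their own uniform quantizer and scale.
        hi = np.abs(x).max()
        center = np.abs(x) <= bp
        out = np.empty_like(x)
        out[center] = uniform_q(x[center], -bp, bp, bits)
        out[~center] = np.sign(x[~center]) * uniform_q(
            np.abs(x[~center]), bp, hi, bits)
        return out

    w = np.random.randn(10000)  # bell-shaped, like trained weights
    print(np.abs(pwl_quantize(w, 1.0) - w).max())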

Artificial neural networks for photonic applications—from algorithms to implementation: tutorial

P Freire, E Manuylovich, JE Prilepsky… - Advances in Optics and …, 2023 - opg.optica.org
This tutorial–review on applications of artificial neural networks in photonics targets a broad
audience, ranging from optical research and engineering communities to computer science …