Model compression and hardware acceleration for neural networks: A comprehensive survey
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
performance improvement for general-purpose processors due to the foreseeable end of Moore's Law …
A survey of quantization methods for efficient neural network inference
This chapter surveys approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages/disadvantages of current methods …
Computational complexity evaluation of neural network applications in signal processing
In this paper, we provide a systematic approach for assessing and comparing the
computational complexity of neural network layers in digital signal processing. We provide …
Pruning and quantization for deep neural network acceleration: A survey
Deep neural networks have demonstrated extraordinary abilities in many computer vision
applications. However, complex network architectures challenge …
{BatchCrypt}: Efficient homomorphic encryption for {Cross-Silo} federated learning
Cross-silo federated learning (FL) enables organizations (e.g., financial or medical) to
collaboratively train a machine learning model by aggregating local gradient updates from …
Differentiable soft quantization: Bridging full-precision and low-bit neural networks
Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently
accelerate inference while reducing the memory consumption of the deep neural …
Integer quantization for deep learning inference: Principles and empirical evaluation
Quantization techniques can reduce the size of Deep Neural Networks and improve
inference latency and throughput by taking advantage of high throughput integer …
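The integer-quantization idea this entry describes can be illustrated with a minimal sketch of symmetric uniform int8 quantization. This is a generic textbook formulation, not the paper's specific recipe; the choice of calibrating the scale from the tensor's maximum magnitude is one common option among several.

```python
import numpy as np

def quantize_int8(x, scale=None):
    """Symmetric uniform quantization: q = round(x / scale), clipped to [-127, 127]."""
    if scale is None:
        # Calibrate the scale from the tensor's largest magnitude (one common choice).
        scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to approximate float values."""
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.2, 3.7, -0.01], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize(q, s)
```

With this symmetric scheme the round-trip error is bounded by half the scale step, which is why accuracy often survives int8 deployment when the value distribution is well-behaved.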
Adabin: Improving binary neural networks with adaptive binary sets
This paper studies the Binary Neural Networks (BNNs) in which weights and activations are
both binarized into 1-bit values, thus greatly reducing the memory usage and computational …
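As a minimal sketch of the 1-bit weight representation BNN papers build on, the following uses the classic XNOR-Net-style scheme (sign of the weight scaled by the mean absolute value). This is a simpler baseline than AdaBin's adaptive binary sets, shown only to make the memory/compute saving concrete.

```python
import numpy as np

def binarize(w):
    """Binarize weights to {-alpha, +alpha}, with alpha the mean absolute value."""
    alpha = np.mean(np.abs(w))          # per-tensor scaling factor
    b = np.where(w >= 0, 1.0, -1.0)     # 1-bit sign pattern
    return alpha * b

w = np.array([[0.3, -0.7], [1.1, -0.2]])
w_b = binarize(w)
```

Only the sign bits and one scalar per tensor need to be stored, and dot products reduce to XNOR/popcount operations on binary vectors.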
Post-training piecewise linear quantization for deep neural networks
Quantization plays an important role in the energy-efficient deployment of deep neural
networks on resource-limited devices. Post-training quantization is highly desirable since it …
Artificial neural networks for photonic applications—from algorithms to implementation: tutorial
This tutorial–review on applications of artificial neural networks in photonics targets a broad
audience, ranging from optical research and engineering communities to computer science …