Model compression and hardware acceleration for neural networks: A comprehensive survey
Domain-specific hardware is becoming a promising topic against the backdrop of slowing improvement in general-purpose processors due to the foreseeable end of Moore's Law …
An overview of neural network compression
JO Neill - arXiv preprint arXiv:2006.03669, 2020 - arxiv.org
Overparameterized networks trained to convergence have shown impressive performance
in domains such as computer vision and natural language processing. Pushing state of the …
Pruning and quantization for deep neural network acceleration: A survey
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …
Training deep neural networks with 8-bit floating point numbers
The state-of-the-art hardware platforms for training deep neural networks are moving from
traditional single precision (32-bit) computations towards 16 bits of precision, in large part …
A study of BFLOAT16 for deep learning training
This paper presents the first comprehensive empirical study demonstrating the efficacy of the
Brain Floating Point (BFLOAT16) half-precision format for Deep Learning training across …
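The BFLOAT16 format studied in this entry keeps float32's 8-bit exponent and truncates the mantissa to 7 bits, so a bfloat16 value is simply the upper 16 bits of the corresponding float32 bit pattern. A minimal sketch (using truncation for brevity; hardware implementations typically round to nearest even):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    # Pack as IEEE 754 float32 and keep the upper 16 bits:
    # 1 sign + 8 exponent + 7 mantissa bits.
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits32 >> 16

def bfloat16_bits_to_float32(b: int) -> float:
    # Widen back to float32 by zero-filling the low 16 mantissa bits.
    return struct.unpack(">f", struct.pack(">I", b << 16))[0]

# Round-tripping loses mantissa precision but preserves the exponent range.
v = bfloat16_bits_to_float32(float32_to_bfloat16_bits(3.14159))
```

Because the exponent width matches float32, conversion never overflows or underflows, which is why bfloat16 is attractive for training without loss scaling.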
FloatPIM: In-memory acceleration of deep neural network training with high precision
Processing In-Memory (PIM) has shown a great potential to accelerate inference tasks of
Convolutional Neural Network (CNN). However, existing PIM architectures do not support …
Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors
Although the quest for more accurate solutions is pushing deep learning research towards
larger and more complex algorithms, edge devices demand efficient inference and therefore …
Towards unified INT8 training for convolutional neural network
Recently, low-bit (e.g., 8-bit) network quantization has been extensively studied to
accelerate the inference. Besides inference, low-bit training with quantized gradients can …
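The 8-bit quantization this entry refers to can be illustrated with a generic symmetric per-tensor scheme: map the largest magnitude in a tensor to 127 and round everything else to that grid. This is a common textbook sketch, not the specific scheme proposed in the paper:

```python
def quantize_int8(values):
    # Symmetric per-tensor quantization: the scale maps the
    # largest magnitude onto the int8 extreme value 127.
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; error is bounded by scale / 2.
    return [qi * scale for qi in q]

q, s = quantize_int8([0.5, -1.27, 0.0])   # q is a list of ints in [-127, 127]
```

Quantizing gradients for training (as opposed to weights and activations for inference) is harder because gradient distributions are heavy-tailed, which is the gap the paper above targets.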
A Neural-Network-Based Model Predictive Control of Three-Phase Inverter With an Output Filter
Model predictive control (MPC) has become one of the well-established modern control
methods for three-phase inverters with an output LC filter, where a high-quality voltage with …
Use of neural networks for stable, accurate and physically consistent parameterization of subgrid atmospheric processes with good performance at reduced precision
A promising approach to improve climate‐model simulations is to replace traditional subgrid
parameterizations based on simplified physical models by machine learning algorithms that …