A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
Iterative integration of deep learning in hybrid Earth surface system modelling
Earth system modelling (ESM) is essential for understanding past, present and future Earth
processes. Deep learning (DL), with the data-driven strength of neural networks, has …
A survey of quantization methods for efficient neural network inference
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …
Structured pruning for deep convolutional neural networks: A survey
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
Similarity-preserving knowledge distillation
Knowledge distillation is a widely applicable technique for training a student neural
network under the guidance of a trained teacher network. For example, in neural network …
Pruning vs quantization: Which is better?
Neural network pruning and quantization techniques are almost as old as neural networks
themselves. However, to date, only ad-hoc comparisons between the two have been …
Filter pruning via geometric median for deep convolutional neural networks acceleration
Previous works utilized "smaller-norm-less-important" criterion to prune filters with smaller
norm values in a convolutional neural network. In this paper, we analyze this norm-based …
Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution
In natural images, information is conveyed at different frequencies where higher frequencies
are usually encoded with fine details and lower frequencies are usually encoded with global …
Differentiable soft quantization: Bridging full-precision and low-bit neural networks
Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently
accelerate the inference and meanwhile reduce memory consumption of the deep neural …
Resource orchestration of cloud-edge–based smart grid fault detection
Real-time smart grid monitoring is critical to enhancing resiliency and operational efficiency
of power equipment. Cloud-based and edge-based fault detection systems integrating deep …