A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
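The survey above concerns pruning as a way to shrink oversized networks. As a point of reference, the simplest variant — unstructured magnitude pruning — zeroes the smallest-magnitude weights; the `magnitude_prune` helper below is an illustrative sketch, not code from the survey:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a weight matrix.

    `sparsity` is the fraction of weights to remove. Ties at the
    threshold may prune slightly more than requested; a real
    implementation would break ties explicitly.
    """
    w = weights.copy()
    k = int(sparsity * w.size)
    if k == 0:
        return w
    # Threshold = k-th smallest absolute value; everything at or below it is zeroed.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    w[np.abs(w) <= threshold] = 0.0
    return w

w = np.array([[0.5, -0.1], [0.02, 0.9]])
pruned = magnitude_prune(w, 0.5)  # zeroes the two smallest-magnitude weights
```

In practice the pruned mask is applied repeatedly during fine-tuning; this sketch shows only the one-shot selection rule.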

Iterative integration of deep learning in hybrid Earth surface system modelling

M Chen, Z Qian, N Boers, AJ Jakeman… - Nature Reviews Earth & …, 2023 - nature.com
Earth system modelling (ESM) is essential for understanding past, present and future Earth
processes. Deep learning (DL), with the data-driven strength of neural networks, has …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages/disadvantages of current methods …


Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …

Similarity-preserving knowledge distillation

F Tung, G Mori - Proceedings of the IEEE/CVF international …, 2019 - openaccess.thecvf.com
Knowledge distillation is a widely applicable technique for training a student neural
network under the guidance of a trained teacher network. For example, in neural network …
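The similarity-preserving loss proposed in this paper matches the student's batch-wise pairwise-similarity structure to the teacher's, rather than matching activations directly. A NumPy sketch of that loss (the name `sp_loss` and the single-layer setup are my own simplifications):

```python
import numpy as np

def sp_loss(f_student, f_teacher):
    """Similarity-preserving distillation loss, after Tung & Mori (2019).

    Each input is a (batch, features) activation matrix. The loss is the
    mean squared difference between the row-normalized batch Gram
    matrices of teacher and student.
    """
    def similarity(f):
        g = f @ f.T                                  # (batch, batch) Gram matrix
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        return g / np.maximum(norms, 1e-12)          # row-normalize

    b = f_student.shape[0]
    diff = similarity(f_teacher) - similarity(f_student)
    return np.sum(diff ** 2) / b ** 2
```

Because only batch-level similarity patterns are compared, the student and teacher may have different feature dimensions, which is part of the method's appeal.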

Pruning vs quantization: Which is better?

A Kuzmin, M Nagel, M Van Baalen… - Advances in neural …, 2023 - proceedings.neurips.cc
Neural network pruning and quantization techniques are almost as old as neural networks
themselves. However, to date, only ad-hoc comparisons between the two have been …

Filter pruning via geometric median for deep convolutional neural networks acceleration

Y He, P Liu, Z Wang, Z Hu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Previous works utilized the "smaller-norm-less-important" criterion to prune filters with smaller
norm values in a convolutional neural network. In this paper, we analyze this norm-based …
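In contrast to the norm-based criterion the snippet describes, this paper prunes the filters nearest the geometric median of the filter set, i.e., the most replaceable ones. A rough sketch of that selection rule, approximating "nearest the geometric median" by smallest total pairwise distance (not the authors' implementation):

```python
import numpy as np

def fpgm_select(filters, n_prune):
    """Select filter indices to prune, in the spirit of FPGM (He et al., 2019).

    `filters`: (num_filters, fan_in) array of flattened conv filters.
    Filters with the smallest total distance to all other filters sit
    near the geometric median and are treated as the most redundant.
    """
    # Pairwise Euclidean distances between all filters.
    dists = np.linalg.norm(filters[:, None, :] - filters[None, :, :], axis=-1)
    total = dists.sum(axis=1)            # each filter's distance to the rest
    return np.argsort(total)[:n_prune]   # smallest total = most redundant
```

Note this ranks by redundancy among filters, so a duplicated filter gets pruned even if its norm is large — the key difference from the smaller-norm criterion.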

Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution

Y Chen, H Fan, B Xu, Z Yan… - Proceedings of the …, 2019 - openaccess.thecvf.com
In natural images, information is conveyed at different frequencies where higher frequencies
are usually encoded with fine details and lower frequencies are usually encoded with global …
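Octave convolution's core move is to route a fraction of channels through a half-resolution, low-frequency path, saving memory and compute on the smooth part of the signal. A toy sketch of that channel split (the `octave_split` helper, its `alpha` default, and the use of 2x2 average pooling for downsampling are illustrative assumptions):

```python
import numpy as np

def octave_split(feature, alpha=0.5):
    """Split a (channels, H, W) feature map into high/low-frequency groups.

    The low-frequency group (`alpha` fraction of channels) is stored at
    half spatial resolution via 2x2 average pooling, as in the
    octave-convolution idea; the rest stays at full resolution.
    """
    c = feature.shape[0]
    c_low = int(alpha * c)
    high = feature[c_low:]                     # full-resolution channels
    low = feature[:c_low]
    h, w = low.shape[1] // 2, low.shape[2] // 2
    # 2x2 average pool: reshape into blocks and mean over each block.
    low = low[:, :2 * h, :2 * w].reshape(c_low, h, 2, w, 2).mean(axis=(2, 4))
    return high, low
```

The full operator also defines convolutions within and between the two paths; this sketch covers only the representation split.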

Differentiable soft quantization: Bridging full-precision and low-bit neural networks

R Gong, X Liu, S Jiang, T Li, P Hu… - Proceedings of the …, 2019 - openaccess.thecvf.com
Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently
accelerate the inference and meanwhile reduce memory consumption of the deep neural …
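Uniform affine quantization is the hardware-friendly baseline this paper starts from: floats are mapped to a small integer grid and back, and the round-trip error is what quantization-aware methods like differentiable soft quantization try to reduce. A minimal sketch (the names `quantize_uniform`/`dequantize` are illustrative):

```python
import numpy as np

def quantize_uniform(x, num_bits=8):
    """Map a float array onto integers in [0, 2^b - 1] (affine quantization)."""
    qmax = 2 ** num_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.clip(np.round((x - lo) / scale), 0, qmax)  # integer codes
    return q.astype(np.int32), scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float values from integer codes."""
    return q.astype(np.float32) * scale + lo

x = np.linspace(-1, 1, 100).astype(np.float32)
q, scale, zero = quantize_uniform(x, num_bits=8)
x_hat = dequantize(q, scale, zero)  # round-trip error is at most scale / 2
```

Rounding makes the mapping non-differentiable, which is exactly the gap the paper's soft quantization bridges during training.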

Resource orchestration of cloud-edge-based smart grid fault detection

J Li, Y Deng, W Sun, W Li, R Li, Q Li, Z Liu - ACM Transactions on …, 2022 - dl.acm.org
Real-time smart grid monitoring is critical to enhancing resiliency and operational efficiency
of power equipment. Cloud-based and edge-based fault detection systems integrating deep …