A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

Convolutional neural network pruning with structural redundancy reduction

Z Wang, C Li, X Wang - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Convolutional neural network (CNN) pruning has become one of the most successful
network compression approaches in recent years. Existing works on network pruning …

Revisiting random channel pruning for neural network compression

Y Li, K Adamczewski, W Li, S Gu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Channel (or 3D filter) pruning serves as an effective way to accelerate the inference of
neural networks. There has been a flurry of algorithms that try to solve this practical problem …

Chip: Channel independence-based pruning for compact neural networks

Y Sui, M Yin, Y Xie, H Phan… - Advances in Neural …, 2021 - proceedings.neurips.cc
Filter pruning has been widely used for neural network compression because of its enabled
practical acceleration. To date, most of the existing filter pruning works explore the …

Edge intelligence: Empowering intelligence to the edge of network

D Xu, T Li, Y Li, X Su, S Tarkoma, T Jiang… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Edge intelligence refers to a set of connected systems and devices for data collection,
caching, processing, and analysis in proximity to where data are captured based on artificial …

Resrep: Lossless cnn pruning via decoupling remembering and forgetting

X Ding, T Hao, J Tan, J Liu, J Han… - Proceedings of the …, 2021 - openaccess.thecvf.com
We propose ResRep, a novel method for lossless channel pruning (aka filter pruning), which
slims down a CNN by reducing the width (number of output channels) of convolutional …

Channel pruning via automatic structure search

M Lin, R Ji, Y Zhang, B Zhang, Y Wu, Y Tian - arXiv preprint arXiv …, 2020 - arxiv.org
Channel pruning is among the predominant approaches to compress deep neural networks.
To this end, most existing pruning methods focus on selecting channels (filters) by …