A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
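
As a concrete anchor for the taxonomy, here is a minimal unstructured magnitude-pruning sketch in PyTorch; it illustrates the simplest point in the design space the survey maps, not any specific method it compares. The toy model and the 50% ratio are arbitrary choices.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Gather every (module, parameter name) pair that should be pruned.
params_to_prune = [(m, "weight") for m in model.modules() if isinstance(m, nn.Linear)]

# Zero the 50% of weights with the smallest absolute value, ranked model-wide.
prune.global_unstructured(params_to_prune, pruning_method=prune.L1Unstructured, amount=0.5)

# Measure the resulting global sparsity.
total = sum(m.weight.nelement() for m, _ in params_to_prune)
zeros = sum(int((m.weight == 0).sum()) for m, _ in params_to_prune)
print(f"global sparsity: {zeros / total:.1%}")
```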

Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
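
For contrast with the unstructured sketch above, here is a structured (filter-level) pruning example using PyTorch's built-in ln_structured utility; the layer shape and the 25% ratio are illustrative, and this is a generic example rather than any method ranked in the survey.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)

# Zero whole output filters (dim=0), ranked by L1 norm (n=1); removing entire
# filters keeps the weight layout dense, which is what makes speedups practical.
prune.ln_structured(conv, name="weight", amount=0.25, n=1, dim=0)

# Count filters whose weights are now all zero.
with torch.no_grad():
    dead = int((conv.weight.abs().sum(dim=(1, 2, 3)) == 0).sum())
print(f"pruned {dead}/{conv.out_channels} filters")
```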

GhostNetV2: Enhance cheap operation with long-range attention

Y Tang, K Han, J Guo, C Xu, C Xu… - Advances in Neural …, 2022 - proceedings.neurips.cc
Lightweight convolutional neural networks (CNNs) are specially designed for applications
on mobile devices, offering faster inference. The convolutional operation can only capture …
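
A hedged sketch of the Ghost module's cheap operation follows: a small primary convolution produces a few intrinsic feature maps, and an inexpensive depthwise convolution generates the remaining "ghost" maps. The long-range (DFC) attention branch that GhostNetV2 adds is omitted, and the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        init_ch = out_ch // ratio        # intrinsic maps from the costly conv
        ghost_ch = out_ch - init_ch      # ghost maps from the cheap conv
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(      # depthwise 3x3: one filter per map
            nn.Conv2d(init_ch, ghost_ch, kernel_size=3, padding=1,
                      groups=init_ch, bias=False),
            nn.BatchNorm2d(ghost_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

x = torch.randn(1, 16, 32, 32)
print(GhostModule(16, 32)(x).shape)      # torch.Size([1, 32, 32, 32])
```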

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
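
One recurring pattern the survey covers is dynamic sparse training, where connections are pruned and regrown during optimization. The sketch below shows a single prune-and-regrow step under simplifying assumptions: magnitude-based dropping and random regrowth (gradient-based growth criteria are common in practice), with arbitrary sizes and k.

```python
import torch

def prune_and_regrow(weight, mask, k):
    """One step: drop the k smallest-magnitude active weights, then
    reactivate k currently inactive positions chosen at random."""
    flat_w, flat_m = weight.view(-1), mask.view(-1)
    active = flat_m.nonzero().squeeze(1)
    inactive = (flat_m == 0).nonzero().squeeze(1)
    # Prune: deactivate the k active weights with the smallest magnitude.
    drop = active[flat_w[active].abs().argsort()[:k]]
    flat_m[drop] = 0
    # Grow: reactivate k inactive positions at random.
    grow = inactive[torch.randperm(inactive.numel())[:k]]
    flat_m[grow] = 1
    weight.data.mul_(mask)   # keep pruned positions at exactly zero

w = torch.randn(64, 64)
mask = (torch.rand_like(w) < 0.5).float()
w.data.mul_(mask)
prune_and_regrow(w, mask, k=100)
print(f"sparsity after step: {(mask == 0).float().mean():.1%}")   # stays ~50%
```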

Distilling object detectors via decoupled features

J Guo, K Han, Y Wang, H Wu… - Proceedings of the …, 2021 - openaccess.thecvf.com
Knowledge distillation is a widely used paradigm for inheriting information from a
complicated teacher network to a compact student network and maintaining the strong …
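
As background, a minimal feature-imitation distillation loss is sketched below: the student's feature map is projected to the teacher's channel width with a 1x1 convolution and matched with an MSE penalty. The decoupled foreground/background treatment that gives the paper its name is not reproduced; channel counts and shapes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureImitation(nn.Module):
    def __init__(self, student_ch: int, teacher_ch: int):
        super().__init__()
        # 1x1 conv adapts the student's channel count to the teacher's.
        self.adapt = nn.Conv2d(student_ch, teacher_ch, kernel_size=1)

    def forward(self, f_student, f_teacher):
        # Teacher features are detached: only the student receives gradients.
        return F.mse_loss(self.adapt(f_student), f_teacher.detach())

loss_fn = FeatureImitation(student_ch=128, teacher_ch=256)
fs = torch.randn(2, 128, 20, 20)   # student backbone feature
ft = torch.randn(2, 256, 20, 20)   # teacher backbone feature
print(loss_fn(fs, ft).item())
```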

CHIP: Channel independence-based pruning for compact neural networks

Y Sui, M Yin, Y Xie, H Phan… - Advances in Neural …, 2021 - proceedings.neurips.cc
Filter pruning has been widely used for neural network compression because it enables
practical acceleration. To date, most of the existing filter pruning works explore the …
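
For orientation, the sketch below ranks filters by a simple proxy, the average activation norm on a calibration batch, and copies the surviving filters into a slimmer layer. CHIP's actual channel-independence criterion (based on nuclear norms of feature maps) is more involved; this stand-in only shows the rank-and-rebuild workflow.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
calib = torch.randn(8, 3, 32, 32)                      # calibration batch

with torch.no_grad():
    fmap = conv(calib)                                 # (N, C, H, W)
    # Per-channel score: mean L2 norm of that channel's feature maps.
    scores = fmap.flatten(2).norm(dim=2).mean(dim=0)   # (C,)

keep = scores.topk(k=8).indices.sort().values          # channels to retain
slim = nn.Conv2d(3, 8, kernel_size=3, padding=1)
with torch.no_grad():
    slim.weight.copy_(conv.weight[keep])
    slim.bias.copy_(conv.bias[keep])
print("kept channels:", keep.tolist())
```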

Patch slimming for efficient vision transformers

Y Tang, K Han, Y Wang, C Xu, J Guo… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper studies the efficiency problem of vision transformers by excavating redundant
computation in given networks. The recent transformer architecture has demonstrated its …
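
A toy token-pruning step in this spirit: rank patch tokens by the [CLS] token's average attention to them and keep the top k. The paper's top-down procedure with an impact-based criterion is not reproduced; the attention-score proxy, shapes, and k below are assumptions for illustration.

```python
import torch

def slim_patches(tokens, attn, k):
    """tokens: (B, 1+N, D) with [CLS] first; attn: (B, heads, 1+N, 1+N)."""
    cls_attn = attn[:, :, 0, 1:].mean(dim=1)        # (B, N) CLS->patch weight
    keep = cls_attn.topk(k, dim=1).indices          # indices of kept patches
    patches = tokens[:, 1:]                         # (B, N, D)
    kept = torch.gather(patches, 1,
                        keep.unsqueeze(-1).expand(-1, -1, patches.size(-1)))
    return torch.cat([tokens[:, :1], kept], dim=1)  # (B, 1+k, D)

B, N, D, H = 2, 196, 384, 6
tokens = torch.randn(B, 1 + N, D)
attn = torch.softmax(torch.randn(B, H, 1 + N, 1 + N), dim=-1)
print(slim_patches(tokens, attn, k=98).shape)       # torch.Size([2, 99, 384])
```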

Learning structured sparsity in deep neural networks

W Wen, C Wu, Y Wang, Y Chen… - Advances in neural …, 2016 - proceedings.neurips.cc
The high demand for computational resources severely hinders the deployment of large-scale
deep neural networks (DNNs) on resource-constrained devices. In this work, we propose a …
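
A short sketch of the group-lasso-style regularizer the paper builds on: penalizing the L2 norm of each filter group (and each input-channel group) drives whole groups toward zero together. The grouping and the weight lam here are illustrative choices.

```python
import torch
import torch.nn as nn

def group_lasso(conv: nn.Conv2d, lam: float = 1e-4) -> torch.Tensor:
    w = conv.weight                                    # (C_out, C_in, k, k)
    filter_term = w.flatten(1).norm(dim=1).sum()       # one group per filter
    channel_term = w.transpose(0, 1).flatten(1).norm(dim=1).sum()  # per input channel
    return lam * (filter_term + channel_term)

conv = nn.Conv2d(16, 32, 3)
x = torch.randn(4, 16, 8, 8)
loss = conv(x).pow(2).mean() + group_lasso(conv)   # task-loss stand-in + penalty
loss.backward()
print(conv.weight.grad.shape)
```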

Only train once: A one-shot neural network training and pruning framework

T Chen, B Ji, T Ding, B Fang, G Wang… - Advances in …, 2021 - proceedings.neurips.cc
Structured pruning is a commonly used technique for deploying deep neural networks
(DNNs) onto resource-constrained devices. However, the existing pruning methods are …
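
A hedged sketch of the one-shot flavor: a proximal (group soft-threshold) step applied during training drives whole filter groups exactly to zero, so no separate fine-tuning pass is needed before removing them. OTO's zero-invariant groups and half-space projected gradient are more elaborate; the loss, threshold, and loop below are toy stand-ins.

```python
import torch
import torch.nn as nn

def group_soft_threshold(weight: torch.Tensor, thresh: float):
    """Shrink each output-filter group toward zero; groups whose L2 norm
    falls below `thresh` become exactly zero (and hence removable)."""
    norms = weight.flatten(1).norm(dim=1, keepdim=True)          # (C_out, 1)
    scale = torch.clamp(1 - thresh / norms.clamp_min(1e-12), min=0.0)
    weight.mul_(scale.view(-1, *[1] * (weight.dim() - 1)))

conv = nn.Conv2d(8, 16, 3)
opt = torch.optim.SGD(conv.parameters(), lr=0.1)
for _ in range(100):                       # toy loop; real training uses data
    opt.zero_grad()
    out = conv(torch.randn(4, 8, 8, 8))
    out.pow(2).mean().backward()           # stand-in for a task loss
    opt.step()
    with torch.no_grad():                  # proximal step after each update
        group_soft_threshold(conv.weight, thresh=3e-3)

dead = int((conv.weight.flatten(1).norm(dim=1) == 0).sum())
print(f"{dead}/16 filters are exactly zero and can be removed")
```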

Width & depth pruning for vision transformers

F Yu, K Huang, M Wang, Y Cheng, W Chu… - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Transformer models have demonstrated promising potential and achieved excellent
performance on a series of computer vision tasks. However, the huge computational cost of …
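
A width-pruning sketch in this vein: gate each attention head of a transformer block with a binary mask so low-importance heads can be dropped (depth pruning would analogously gate whole blocks). The paper learns these decisions jointly; the random head-importance scores below are placeholders.

```python
import torch
import torch.nn as nn

class MaskedSelfAttention(nn.Module):
    def __init__(self, dim: int, heads: int):
        super().__init__()
        self.h, self.dh = heads, dim // heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.register_buffer("head_mask", torch.ones(heads))

    def forward(self, x):                               # x: (B, N, dim)
        B, N, _ = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.h, self.dh).permute(2, 0, 3, 1, 4)
        q, k, v = qkv[0], qkv[1], qkv[2]                # each (B, h, N, dh)
        attn = (q @ k.transpose(-2, -1)) / self.dh**0.5
        out = attn.softmax(-1) @ v                      # (B, h, N, dh)
        out = out * self.head_mask.view(1, -1, 1, 1)    # zero pruned heads
        return self.proj(out.transpose(1, 2).reshape(B, N, -1))

attn = MaskedSelfAttention(dim=192, heads=6)
scores = torch.rand(6)                  # placeholder head-importance scores
attn.head_mask.copy_((scores >= scores.median()).float())
print(attn(torch.randn(2, 50, 192)).shape)   # torch.Size([2, 50, 192])
```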