A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
Structured pruning for deep convolutional neural networks: A survey
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
GhostNetV2: Enhance cheap operation with long-range attention
Light-weight convolutional neural networks (CNNs) are specially designed for applications
on mobile devices with faster inference speed. The convolutional operation can only capture …
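The snippet cuts off before the method; for orientation, here is a minimal sketch of the Ghost module idea that GhostNetV2 builds on: a regular convolution produces a few intrinsic feature maps, and cheap depthwise convolutions derive the rest. The channel counts and ratio below are illustrative assumptions, and GhostNetV2's long-range DFC attention branch is omitted.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        intrinsic = out_ch // ratio
        # Primary 1x1 conv produces the "intrinsic" feature maps.
        self.primary = nn.Conv2d(in_ch, intrinsic, kernel_size=1, bias=False)
        # Cheap depthwise 3x3 conv derives the remaining maps from them.
        self.cheap = nn.Conv2d(intrinsic, out_ch - intrinsic, kernel_size=3,
                               padding=1, groups=intrinsic, bias=False)

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

x = torch.randn(1, 16, 32, 32)
print(GhostModule(16, 32)(x).shape)  # torch.Size([1, 32, 32, 32])
```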
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
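As a concrete instance of "selectively pruning components," the following is a minimal sketch of unstructured magnitude pruning, one of the baseline techniques such surveys cover. It uses PyTorch's built-in torch.nn.utils.prune utilities; the toy model and the 50% sparsity target are illustrative assumptions, not taken from the survey.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Collect every weight tensor to be pruned jointly.
parameters_to_prune = [
    (m, "weight") for m in model.modules() if isinstance(m, nn.Linear)
]

# Zero out the 50% of weights with the smallest absolute magnitude,
# ranked globally across all listed tensors.
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Make the pruning permanent (folds the mask into the weights).
for m, name in parameters_to_prune:
    prune.remove(m, name)
```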
Distilling object detectors via decoupled features
Knowledge distillation is a widely used paradigm for inheriting information from a
complicated teacher network to a compact student network and maintaining the strong …
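For reference, below is a minimal sketch of the classic logit-distillation loss (softened teacher/student distributions with a temperature). Note this is the common baseline, not the cited paper's method, which distills decoupled *features* for object detection; the temperature value is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T: float = 4.0):
    # Softened distributions; the T^2 factor keeps gradient
    # magnitudes comparable across temperatures.
    log_p_s = F.log_softmax(student_logits / T, dim=-1)
    p_t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)

# Usage: combine with the ordinary task loss during student training.
loss = kd_loss(torch.randn(8, 10), torch.randn(8, 10))
```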
Chip: Channel independence-based pruning for compact neural networks
Filter pruning has been widely used for neural network compression because it enables
practical acceleration. To date, most existing filter pruning works explore the …
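To make "filter pruning" concrete, here is a minimal sketch that ranks filters by the classic L1-norm importance criterion and keeps the top fraction. This is the standard baseline, not CHIP's channel-independence metric; the layer sizes and keep ratio are illustrative.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
keep_ratio = 0.5

# Score each output filter by the L1 norm of its weights.
scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # shape: [128]
n_keep = int(conv.out_channels * keep_ratio)
keep_idx = torch.topk(scores, n_keep).indices.sort().values

# Build a thinner layer containing only the surviving filters.
pruned = nn.Conv2d(conv.in_channels, n_keep, kernel_size=3, padding=1)
with torch.no_grad():
    pruned.weight.copy_(conv.weight[keep_idx])
    pruned.bias.copy_(conv.bias[keep_idx])
```

In a full network the following layer's input channels must be sliced to match the surviving filters, which is what makes this form of pruning "structured" and directly accelerable.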
Patch slimming for efficient vision transformers
This paper studies the efficiency problem for vision transformers by excavating redundant
computation in given networks. The recent transformer architecture has demonstrated its …
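The general idea, discarding uninformative patch tokens so later blocks process a shorter sequence, can be sketched as below. Scoring tokens by their L2 norm is an illustrative stand-in, not the paper's top-down patch-importance estimation; the tensor shapes are likewise assumptions.

```python
import torch

tokens = torch.randn(8, 197, 768)   # [batch, 1 CLS + 196 patches, dim]
keep_ratio = 0.5

cls_tok, patches = tokens[:, :1], tokens[:, 1:]
scores = patches.norm(dim=-1)                        # [batch, 196]
n_keep = int(patches.size(1) * keep_ratio)
idx = scores.topk(n_keep, dim=1).indices             # [batch, n_keep]
idx = idx.unsqueeze(-1).expand(-1, -1, patches.size(-1))
kept = patches.gather(dim=1, index=idx)              # [batch, n_keep, dim]
tokens = torch.cat([cls_tok, kept], dim=1)           # thinner token sequence
```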
Learning structured sparsity in deep neural networks
High demand for computational resources severely hinders the deployment of large-scale Deep
Neural Networks (DNNs) on resource-constrained devices. In this work, we propose a …
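A minimal sketch of the group-Lasso style penalty this line of work builds on: penalizing the L2 norm of each convolution filter as a group drives whole filters toward zero, so they can be removed as structural units. The grouping shown (one group per output filter) and the lambda_g coefficient are illustrative assumptions.

```python
import torch
import torch.nn as nn

def group_lasso_penalty(model: nn.Module) -> torch.Tensor:
    """Sum of L2 norms of each conv filter: pushes whole filters to zero."""
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            # One group per output filter: [out_ch, in_ch, kH, kW] -> [out_ch]
            penalty = penalty + m.weight.flatten(1).norm(dim=1).sum()
    return penalty

# Usage inside a training step (task_loss is the ordinary loss):
# lambda_g = 1e-4
# total_loss = task_loss + lambda_g * group_lasso_penalty(model)
```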
Only train once: A one-shot neural network training and pruning framework
Structured pruning is a commonly used technique in deploying deep neural networks
(DNNs) onto resource-constrained devices. However, the existing pruning methods are …
Width & depth pruning for vision transformers
Transformer models have demonstrated their promising potential and achieved excellent
performance on a series of computer vision tasks. However, the huge computational cost of …