A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
Model compression and hardware acceleration for neural networks: A comprehensive survey
Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
Pruning and quantization for deep neural network acceleration: A survey
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …
Convolutional neural network pruning with structural redundancy reduction
Z Wang, C Li, X Wang - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Convolutional neural network (CNN) pruning has become one of the most successful
network compression approaches in recent years. Existing works on network pruning …
Revisiting random channel pruning for neural network compression
Channel (or 3D filter) pruning serves as an effective way to accelerate the inference of
neural networks. There has been a flurry of algorithms that try to solve this practical problem …
CHIP: Channel independence-based pruning for compact neural networks
Filter pruning has been widely used for neural network compression because of its enabled
practical acceleration. To date, most of the existing filter pruning works explore the …
Edge intelligence: Empowering intelligence to the edge of network
Edge intelligence refers to a set of connected systems and devices for data collection,
caching, processing, and analysis proximity to where data are captured based on artificial …
ResRep: Lossless CNN pruning via decoupling remembering and forgetting
We propose ResRep, a novel method for lossless channel pruning (aka filter pruning), which
slims down a CNN by reducing the width (number of output channels) of convolutional …
Channel pruning via automatic structure search
Channel pruning is among the predominant approaches to compress deep neural networks.
To this end, most existing pruning methods focus on selecting channels (filters) by …