Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …

Chex: Channel exploration for CNN model compression

Z Hou, M Qin, F Sun, X Ma, K Yuan… - Proceedings of the …, 2022 - openaccess.thecvf.com
Channel pruning has been broadly recognized as an effective technique to reduce the
computation and memory cost of deep convolutional neural networks. However …
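For context, channel pruning removes entire output channels (filters) from convolution layers, so the compressed network is genuinely smaller and faster rather than merely sparse. Below is a minimal PyTorch sketch of that general idea using a simple L1-norm importance criterion; the criterion and layer handling are illustrative assumptions, not the CHEX method.

```python
# Minimal channel-pruning sketch: keep the conv filters with the largest
# L1 norms. Illustrative only; the keep_ratio and criterion are assumptions.
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a new Conv2d keeping only the highest-L1-norm output channels."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # Importance score per output channel: L1 norm of its filter weights.
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep_idx = torch.argsort(scores, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned

# Example: halve the channels of a 64-filter convolution.
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
print(prune_conv_channels(conv, keep_ratio=0.5))  # Conv2d(3, 32, ...)
```

Note that in a full network the next layer's input channels must be pruned to match, which is what makes structured pruning harder than it looks here.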

Advancing model pruning via bi-level optimization

Y Zhang, Y Yao, P Ram, P Zhao… - Advances in …, 2022 - proceedings.neurips.cc
The deployment constraints in practical applications necessitate the pruning of large-scale
deep learning models, i.e., promoting their weight sparsity. As illustrated by the Lottery Ticket …
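"Promoting weight sparsity" here means zeroing out individual weights. A minimal sketch of the simplest version, plain magnitude pruning with a binary mask, is below; it illustrates the sparsity objective only, not the paper's bi-level optimization.

```python
# Magnitude-pruning sketch: zero the smallest-|w| entries with a 0/1 mask.
# This is the baseline notion of weight sparsity, not the paper's method.
import torch

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a 0/1 mask that keeps the (1 - sparsity) largest-|w| entries."""
    k = int(weight.numel() * sparsity)  # number of weights to zero out
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

w = torch.randn(256, 256)
mask = magnitude_mask(w, sparsity=0.9)           # prune 90% of entries
w_sparse = w * mask                              # masked weight tensor
print(f"actual sparsity: {1 - mask.mean().item():.3f}")
```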

Structural pruning via latency-saliency knapsack

M Shen, H Yin, P Molchanov, L Mao… - Advances in Neural …, 2022 - proceedings.neurips.cc
Structural pruning can simplify network architecture and improve inference speed. We
propose Hardware-Aware Latency Pruning (HALP) that formulates structural pruning as a …
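The knapsack formulation keeps the set of structural units (e.g., channels) that maximizes total saliency while their summed latency cost stays under a budget. The sketch below uses a greedy saliency-per-latency heuristic as an illustrative stand-in for HALP's actual solver; the unit names, scores, and costs are invented for the example.

```python
# Knapsack-style structural pruning sketch: select units with the best
# saliency-per-latency ratio until the latency budget is spent.
# Greedy heuristic for illustration; not HALP's solver.
from dataclasses import dataclass

@dataclass
class Unit:
    name: str        # e.g., a channel or neuron group (hypothetical names)
    saliency: float  # estimated importance if kept
    latency: float   # measured latency cost of keeping it (ms)

def knapsack_keep(units: list[Unit], latency_budget: float) -> list[Unit]:
    """Greedily keep units with the highest saliency-per-latency ratio."""
    kept, spent = [], 0.0
    for u in sorted(units, key=lambda u: u.saliency / u.latency, reverse=True):
        if spent + u.latency <= latency_budget:
            kept.append(u)
            spent += u.latency
    return kept

units = [Unit("conv1.ch0", 0.9, 1.2), Unit("conv1.ch1", 0.2, 1.1),
         Unit("conv2.ch0", 0.7, 0.6), Unit("conv2.ch1", 0.4, 0.5)]
print([u.name for u in knapsack_keep(units, latency_budget=2.0)])
```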

Automatic network pruning via Hilbert-Schmidt independence criterion Lasso under information bottleneck principle

S Guo, L Zhang, X Zheng, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Most existing neural network pruning methods hand-craft their importance criteria and the
structures to prune. This creates heavy and unintended dependencies on heuristics and …

Pruning-as-search: Efficient neural architecture search via channel pruning and structural reparameterization

Y Li, P Zhao, G Yuan, X Lin, Y Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Neural architecture search (NAS) and network pruning are widely studied techniques for
efficient AI, but neither is yet perfect. NAS performs an exhaustive candidate architecture search …

Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision

X Luo, D Liu, H Kong, S Huai, H Chen… - ACM Transactions on …, 2024 - dl.acm.org
Deep neural networks (DNNs) have recently achieved impressive success across a wide
range of real-world vision and language processing tasks, spanning from image …

Quantformer: Learning extremely low-precision vision transformers

Z Wang, C Wang, X Xu, J Zhou… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
In this article, we propose extremely low-precision vision transformers called Quantformer for
efficient inference. Conventional network quantization methods directly quantize weights …
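Low-precision inference rests on mapping floating-point weights to a small set of integer levels. The sketch below shows generic symmetric per-tensor fake quantization, the baseline operation behind low-bit networks; it is an illustrative assumption, not Quantformer's transformer-specific scheme.

```python
# Uniform symmetric fake-quantization sketch: round weights to signed
# low-bit integer levels, then dequantize. Generic baseline, not Quantformer.
import torch

def quantize_symmetric(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Fake-quantize `w` to signed `bits`-bit levels and map back to float."""
    qmax = 2 ** (bits - 1) - 1                   # e.g., 7 for 4-bit
    scale = w.abs().max() / qmax                 # per-tensor scale factor
    q = torch.clamp(torch.round(w / scale), -qmax, qmax)
    return q * scale                             # dequantized approximation

w = torch.randn(64, 64)
w_q = quantize_symmetric(w, bits=4)
print(f"mean abs error: {(w - w_q).abs().mean().item():.4f}")
```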

Learning pruning-friendly networks via Frank-Wolfe: One-shot, any-sparsity, and no retraining

M Lu, X Luo, T Chen, W Chen, D Liu… - … Conference on Learning …, 2022 - openreview.net
We present a novel framework to train a large deep neural network (DNN) only once, which
can then be pruned to any sparsity ratio to preserve competitive …

Quarantine: Sparsity can uncover the trojan attack trigger for free

T Chen, Z Zhang, Y Zhang, S Chang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally
on most samples, yet produce manipulated results for inputs attached with a particular …