A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

More convnets in the 2020s: Scaling up kernels beyond 51x51 using sparsity

S Liu, T Chen, X Chen, X Chen, Q Xiao, B Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformers have quickly shined in the computer vision world since the emergence of
Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) …

Make sharpness-aware minimization stronger: A sparsified perturbation approach

P Mi, L Shen, T Ren, Y Zhou, X Sun… - Advances in Neural …, 2022 - proceedings.neurips.cc
Deep neural networks often suffer from poor generalization caused by complex and non-
convex loss landscapes. One of the popular solutions is Sharpness-Aware Minimization …
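
For context on the perturbation this paper sparsifies, the sketch below shows the standard two-step SAM update (ascend to a worst-case neighbor within an L2 ball of radius rho, then descend with the gradient taken there) on a toy quadratic loss. It is a generic NumPy illustration, not the sparsified method of the cited work; `sam_step` and `grad_fn` are hypothetical names.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One vanilla SAM update (illustrative sketch; the cited paper's
    sparsified variant perturbs only a subset of the weights)."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case L2 perturbation
    g_sharp = grad_fn(w + eps)                   # gradient at the perturbed point
    return w - lr * g_sharp                      # descend with the "sharp" gradient

# Toy quadratic loss L(w) = 0.5 * ||w||^2, whose gradient is simply w
w = np.array([1.0, -2.0])
for _ in range(20):
    w = sam_step(w, grad_fn=lambda v: v)
print(w)  # approaches the (flat) minimum at the origin
```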

Outlier weighed layerwise sparsity (OWL): A missing secret sauce for pruning LLMs to high sparsity

L Yin, Y Wu, Z Zhang, CY Hsieh, Y Wang, Y Jia… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs), renowned for their remarkable performance across diverse
domains, present a challenge when it comes to practical deployment due to their colossal …

The emergence of essential sparsity in large pre-trained models: The weights that matter

A Jaiswal, S Liu, T Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Large pre-trained transformers are show-stealers in modern-day deep learning,
and it becomes crucial to comprehend the parsimonious patterns that exist within them as …

Deep neural network fusion via graph matching with applications to model ensemble and federated learning

C Liu, C Lou, R Wang, AY Xi… - … on Machine Learning, 2022 - proceedings.mlr.press
Abstract Model fusion without accessing training data in machine learning has attracted
increasing interest due to the practical resource-saving and data privacy issues. During the …

Federated dynamic sparse training: Computing less, communicating less, yet learning better

S Bibikar, H Vikalo, Z Wang, X Chen - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Federated learning (FL) enables distribution of machine learning workloads from the cloud
to resource-limited edge devices. Unfortunately, current deep networks remain not only too …

Learning best combination for efficient N:M sparsity

Y Zhang, M Lin, Z Lin, Y Luo, K Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
By forcing N out of M consecutive weights to be non-zero, the recent N:M fine-grained
network sparsity has received increasing attention with its two attractive advantages over …
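
As a concrete illustration of the N:M constraint described above, the sketch below applies a simple magnitude-based 2:4 mask to groups of consecutive weights. This is a minimal NumPy example of the sparsity pattern itself, not the combination-learning method of the cited paper; `nm_prune` is a hypothetical name.

```python
import numpy as np

def nm_prune(weights, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m consecutive
    weights and zero the rest (illustrative N:M masking by magnitude).
    Assumes weights.size is a multiple of m."""
    groups = weights.reshape(-1, m)                         # consecutive groups of m
    drop = np.argsort(np.abs(groups), axis=1)[:, : m - n]   # smallest m-n per group
    mask = np.ones_like(groups)
    np.put_along_axis(mask, drop, 0.0, axis=1)              # zero the smallest entries
    return (groups * mask).reshape(weights.shape)

# Example: prune a 4x8 weight matrix to 2:4 sparsity
w = np.random.randn(4, 8)
w_sparse = nm_prune(w, n=2, m=4)
assert (np.count_nonzero(w_sparse.reshape(-1, 4), axis=1) == 2).all()
```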

DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

W Sun, A Zhou, S Stuijk, R Wijnhoven… - Advances in Neural …, 2021 - proceedings.neurips.cc
Neural pruning is a widely-used compression technique for Deep Neural Networks (DNNs).
Recent innovations in Hardware Architectures (e.g., Nvidia Ampere Sparse Tensor Core) and …

Deep model fusion: A survey

W Li, Y Peng, M Zhang, L Ding, H Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep model fusion/merging is an emerging technique that merges the parameters or
predictions of multiple deep learning models into a single one. It combines the abilities of …
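
As a minimal sketch of the simplest fusion strategy such surveys cover, the example below merges models by weighted parameter averaging. It assumes identical architectures and already-aligned parameter spaces; `average_parameters` is a hypothetical helper, not an API from the cited survey.

```python
import numpy as np

def average_parameters(models, weights=None):
    """Fuse models by weighted averaging of same-named parameters.
    Illustrative only: assumes a shared architecture and aligned parameter
    spaces (no permutation or neuron matching is performed)."""
    weights = weights or [1.0 / len(models)] * len(models)
    return {name: sum(w * m[name] for w, m in zip(weights, models))
            for name in models[0]}

# Two toy "models" given as name -> ndarray dictionaries
m1 = {"layer.weight": np.ones((2, 2)), "layer.bias": np.zeros(2)}
m2 = {"layer.weight": 3 * np.ones((2, 2)), "layer.bias": np.ones(2)}
print(average_parameters([m1, m2])["layer.weight"])  # 2.0 everywhere
```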