A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
More ConvNets in the 2020s: Scaling up kernels beyond 51x51 using sparsity
Transformers have quickly shone in the computer vision world since the emergence of
Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) …
Make sharpness-aware minimization stronger: A sparsified perturbation approach
Deep neural networks often suffer from poor generalization caused by complex and
non-convex loss landscapes. One of the popular solutions is Sharpness-Aware Minimization …
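For context, vanilla SAM first perturbs the weights toward the locally sharpest nearby point and then descends using the gradient evaluated there; the paper above sparsifies that perturbation. A minimal NumPy sketch of the vanilla update (sam_step, grad_fn, rho, and lr are illustrative names, not the paper's API):

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    # Ascend to the approximately sharpest nearby point, then
    # descend using the gradient evaluated there (vanilla SAM).
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_sharp = grad_fn(w + eps)
    # A sparsified variant, as studied above, would zero most of eps
    # with a binary mask before the second gradient evaluation.
    return w - lr * g_sharp

# Toy usage: quadratic loss L(w) = 0.5 * ||w||^2, whose gradient is w.
w = np.array([1.0, -2.0])
w = sam_step(w, grad_fn=lambda v: v)
```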
Outlier weighed layerwise sparsity (OWL): A missing secret sauce for pruning LLMs to high sparsity
Large Language Models (LLMs), renowned for their remarkable performance across diverse
domains, present a challenge when it comes to practical deployment due to their colossal …
The emergence of essential sparsity in large pre-trained models: The weights that matter
Large pre-trained transformers are show-stealers in modern-day deep learning,
and it becomes crucial to comprehend the parsimonious patterns that exist within them as …
Deep neural network fusion via graph matching with applications to model ensemble and federated learning
Model fusion without accessing training data in machine learning has attracted
increasing interest due to the practical resource-saving and data privacy issues. During the …
Federated dynamic sparse training: Computing less, communicating less, yet learning better
Federated learning (FL) enables distribution of machine learning workloads from the cloud
to resource-limited edge devices. Unfortunately, current deep networks remain not only too …
Learning best combination for efficient N:M sparsity
By forcing N out of M consecutive weights to be non-zero, the recent N:M fine-grained
network sparsity has received increasing attention with its two attractive advantages over …
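The constraint named here is concrete enough to sketch: within every group of M consecutive weights, only N may stay non-zero. Below is a minimal NumPy sketch that builds such a mask by plain magnitude (the paper learns better combinations than this greedy rule; nm_prune is an illustrative name and assumes the weight count divides evenly by m):

```python
import numpy as np

def nm_prune(weights, n=2, m=4):
    # Keep the n largest-magnitude weights in every group of m
    # consecutive weights, zeroing the rest (e.g., 2:4 sparsity).
    flat = weights.reshape(-1, m)
    # Indices of the (m - n) smallest magnitudes per group get pruned.
    prune_idx = np.argsort(np.abs(flat), axis=1)[:, : m - n]
    mask = np.ones_like(flat)
    np.put_along_axis(mask, prune_idx, 0.0, axis=1)
    return (flat * mask).reshape(weights.shape)

# Toy usage: two non-zeros survive in each group of four.
w = np.array([1.0, -2.0, 3.0, -4.0, 5.0, -6.0, 7.0, -8.0])
print(nm_prune(w))  # [ 0.  0.  3. -4.  0.  0.  7. -8.]
```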
DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks
Neural pruning is a widely-used compression technique for Deep Neural Networks (DNNs).
Recent innovations in hardware architectures (e.g., NVIDIA Ampere Sparse Tensor Core) and …
Deep model fusion: A survey
Deep model fusion/merging is an emerging technique that merges the parameters or
predictions of multiple deep learning models into a single one. It combines the abilities of …
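The simplest instance of the parameter-level fusion described here is uniform weight averaging across models whose neurons are already aligned; the methods this survey covers (including the graph-matching fusion above) handle the alignment itself. A minimal NumPy sketch under that assumption (average_fuse is an illustrative name):

```python
import numpy as np

def average_fuse(weights):
    # Element-wise average of already-aligned weight tensors,
    # the simplest form of parameter-level model fusion.
    return sum(weights) / len(weights)

# Toy usage: fuse two "models", each a single weight matrix.
w_a = np.array([[1.0, 2.0], [3.0, 4.0]])
w_b = np.array([[3.0, 0.0], [1.0, 2.0]])
print(average_fuse([w_a, w_b]))  # [[2. 1.] [2. 3.]]
```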