A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
Structured pruning for deep convolutional neural networks: A survey
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
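The pruning surveyed above is most commonly instantiated as magnitude pruning: weights with the smallest absolute values are removed. Below is a minimal one-shot sketch in PyTorch; the toy MLP and the 90% sparsity target are illustrative assumptions, not taken from the cited survey.

```python
import torch
import torch.nn as nn

def global_magnitude_prune(model: nn.Module, sparsity: float) -> dict:
    """Zero out the `sparsity` fraction of smallest-magnitude weights across all layers."""
    weights = {n: p for n, p in model.named_parameters() if n.endswith("weight")}
    flat = torch.cat([p.detach().abs().flatten() for p in weights.values()])
    k = max(1, int(sparsity * flat.numel()))
    threshold = torch.kthvalue(flat, k).values      # k-th smallest magnitude overall
    masks = {}
    with torch.no_grad():
        for n, p in weights.items():
            masks[n] = (p.abs() > threshold).float()
            p.mul_(masks[n])                        # apply the sparsity mask in place
    return masks

# Illustrative usage on a toy MLP: keep roughly 10% of the weights.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
masks = global_magnitude_prune(model, sparsity=0.9)
```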
Rethinking attention with performers
We introduce Performers, Transformer architectures which can estimate regular (softmax)
full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to …
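The linear (rather than quadratic) cost mentioned above comes from replacing the softmax kernel with a feature map phi, so attention can be computed as phi(Q)(phi(K)^T V) instead of softmax(QK^T)V. The sketch below uses a simple elu+1 feature map as an illustrative stand-in, not the positive random-feature (FAVOR+) construction that Performers actually propose.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, seq_len, dim); cost is linear in seq_len.
    q = F.elu(q) + 1.0                              # non-negative feature map phi(q)
    k = F.elu(k) + 1.0                              # phi(k)
    kv = torch.einsum("bld,ble->bde", k, v)         # phi(K)^T V, summed over the sequence
    z = 1.0 / (torch.einsum("bld,bd->bl", q, k.sum(dim=1)) + eps)  # row normalizer
    return torch.einsum("bld,bde,bl->ble", q, kv, z)

# Illustrative usage with random tensors.
q, k, v = (torch.randn(2, 128, 64) for _ in range(3))
out = linear_attention(q, k, v)                     # shape (2, 128, 64)
```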
Pruning neural networks without any data by iteratively conserving synaptic flow
Pruning the parameters of deep neural networks has generated intense interest due to
potential savings in time, memory and energy both during training and at test time. Recent …
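The data-free criterion named in the title scores each weight by its contribution to the total synaptic flow R = 1^T (prod_l |W_l|) 1, obtained from one forward-backward pass on an all-ones input through the absolute-valued network, and prunes iteratively. A hedged sketch follows; the number of rounds, the exponential density schedule, and the restriction to weight tensors are illustrative simplifications.

```python
import torch
import torch.nn as nn

def synflow_scores(model: nn.Module, input_shape):
    """Score each weight as |dR/dw * w|, computed without any data."""
    signs = {n: p.data.sign() for n, p in model.named_parameters()}
    for _, p in model.named_parameters():
        p.data.abs_()                               # linearize: replace w by |w|
    r = model(torch.ones(1, *input_shape)).sum()    # R = 1^T (prod_l |W_l|) 1
    r.backward()
    scores = {n: (p.grad * p.data).abs()
              for n, p in model.named_parameters() if n.endswith("weight")}
    for n, p in model.named_parameters():
        p.data.mul_(signs[n])                       # restore the original signs
        p.grad = None
    return scores

def synflow_prune(model, input_shape, final_density=0.1, rounds=100):
    """Iteratively prune toward `final_density` using an exponential schedule."""
    for t in range(1, rounds + 1):
        scores = synflow_scores(model, input_shape)
        flat = torch.cat([s.flatten() for s in scores.values()])
        density = final_density ** (t / rounds)     # fraction of weights to keep this round
        k = int((1.0 - density) * flat.numel())
        if k < 1:
            continue
        threshold = torch.kthvalue(flat, k).values
        with torch.no_grad():
            for n, p in model.named_parameters():
                if n in scores:
                    p.data[scores[n] <= threshold] = 0.0   # remove lowest-flow weights

# Illustrative usage on a toy MLP.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
synflow_prune(model, input_shape=(784,))
```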
The lottery ticket hypothesis for pre-trained bert networks
In natural language processing (NLP), enormous pre-trained models like BERT have
become the standard starting point for training on a range of downstream tasks, and similar …
On the effectiveness of parameter-efficient fine-tuning
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range
of NLP tasks. However, fine-tuning the whole model is parameter inefficient as it always …
Chasing sparsity in vision transformers: An end-to-end exploration
Vision transformers (ViTs) have recently gained explosive popularity, but their enormous
model sizes and training costs remain daunting. Conventional post-training pruning often …
A unified lottery ticket hypothesis for graph neural networks
With graphs rapidly growing in size and deeper graph neural networks (GNNs) emerging,
the training and inference of GNNs become increasingly expensive. Existing network weight …
Sparse training via boosting pruning plasticity with neuroregeneration
Works on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) have recently
drawn considerable attention to post-training pruning (iterative magnitude pruning) and before …
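The post-training pruning contrasted above is typically iterative magnitude pruning (IMP): alternate between training and removing a fraction of the smallest surviving weights. A minimal sketch follows; train_one_epoch and the 20%-per-round schedule are illustrative placeholders, and train_one_epoch is assumed to keep masked weights at zero (for example by re-applying the masks after each optimizer step).

```python
import torch
import torch.nn as nn

def prune_surviving(model: nn.Module, masks: dict, fraction: float) -> dict:
    """Remove an extra `fraction` of the currently surviving weights by magnitude."""
    alive = torch.cat([p.detach().abs()[masks[n].bool()]
                       for n, p in model.named_parameters() if n in masks])
    k = max(1, int(fraction * alive.numel()))
    threshold = torch.kthvalue(alive, k).values
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n in masks:
                masks[n] = masks[n] * (p.abs() > threshold).float()
                p.mul_(masks[n])                    # zero the newly pruned weights
    return masks

def iterative_magnitude_pruning(model, train_one_epoch, rounds=5, per_round=0.2):
    """Alternate training and pruning; returns the final sparsity masks."""
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
             if n.endswith("weight")}
    for _ in range(rounds):
        train_one_epoch(model)                      # retrain the surviving weights
        masks = prune_surviving(model, masks, per_round)
    return masks
```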