A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
Structured pruning for deep convolutional neural networks: A survey
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
On the opportunities of green computing: A survey
Artificial Intelligence (AI) has achieved significant advancements in technology and research
with the development over several decades, and is widely used in many areas including …
Structured pruning adapters
Adapters are a parameter-efficient alternative to fine-tuning, which augment a frozen base
network to learn new tasks. Yet, the inference of the adapted model is often slower than the …
SInGE: Sparsity via integrated gradients estimation of neuron relevance
The leap in performance in state-of-the-art computer vision methods is attributed to the
development of deep neural networks. However, it often comes at a computational price …
A Multiply-And-Max/min Neuron Paradigm for Aggressively Prunable Deep Neural Networks
The growing interest in the Internet of Things (IoT) and mobile artificial intelligence
applications is pushing the investigation on deep neural networks (DNNs) that can operate …
Compressing convolutional neural networks with hierarchical Tucker-2 decomposition
Convolutional neural networks (CNNs) play a crucial role and achieve top results in
computer vision tasks, but at the cost of high computational and storage complexity. One …
A survey of lottery ticket hypothesis
The Lottery Ticket Hypothesis (LTH) states that a dense neural network model contains a
highly sparse subnetwork (i.e., a winning ticket) that can achieve even better performance …
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
The sparsely gated mixture of experts (MoE) architecture sends different inputs to different
subnetworks, i.e., experts, through trainable routers. MoE reduces the training computation …
Pruning-and-distillation: One-stage joint compression framework for CNNs via clustering
Network pruning and knowledge distillation, as two effective network compression
techniques, have drawn extensive attention due to their success in reducing model …