A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
massive model sizes that require significant computational and storage resources. To …
Current progress and open challenges for applying deep learning across the biosciences
Deep Learning (DL) has recently enabled unprecedented advances in one of the grand
challenges in computational biology: the half-century-old problem of protein structure …
challenges in computational biology: the half-century-old problem of protein structure …
H2o: Heavy-hitter oracle for efficient generative inference of large language models
Abstract Large Language Models (LLMs), despite their recent impressive accomplishments,
are notably cost-prohibitive to deploy, particularly for applications involving long-content …
are notably cost-prohibitive to deploy, particularly for applications involving long-content …
Efficientvit: Memory efficient vision transformer with cascaded group attention
Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …
However, their remarkable performance is accompanied by heavy computation costs, which …
Deja vu: Contextual sparsity for efficient llms at inference time
Large language models (LLMs) with hundreds of billions of parameters have sparked a new
wave of exciting AI applications. However, they are computationally expensive at inference …
wave of exciting AI applications. However, they are computationally expensive at inference …
A simple and effective pruning approach for large language models
As their size increases, Large Languages Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …
pruning methods: approaches that drop a subset of network weights while striving to …
Dataset distillation: A comprehensive review
Recent success of deep learning is largely attributed to the sheer amount of data used for
training deep neural networks. Despite the unprecedented success, the massive data …
training deep neural networks. Despite the unprecedented success, the massive data …
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
reduce the size of neural networks by selectively pruning components. Similarly to their …
Distilling knowledge via knowledge review
Abstract Knowledge distillation transfers knowledge from the teacher network to the student
one, with the goal of greatly improving the performance of the student network. Previous …
one, with the goal of greatly improving the performance of the student network. Previous …
Pruning and quantization for deep neural network acceleration: A survey
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …
abilities in the field of computer vision. However, complex network architectures challenge …