A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

Efficientvit: Memory efficient vision transformer with cascaded group attention

X Liu, H Peng, N Zheng, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …

A simple and effective pruning approach for large language models

M Sun, Z Liu, A Bair, JZ Kolter - arxiv preprint arxiv:2306.11695, 2023 - arxiv.org
As their size increases, Large Languages Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …

Patch diffusion: Faster and more data-efficient training of diffusion models

Z Wang, Y Jiang, H Zheng, P Wang… - Advances in neural …, 2024 - proceedings.neurips.cc
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …

AdaLoRA: Adaptive budget allocation for parameter-efficient fine-tuning

Q Zhang, M Chen, A Bukharin… - arxiv preprint arxiv …, 2023 - arxiv.org
Fine-tuning large pre-trained language models on downstream tasks has become an
important paradigm in NLP. However, common practice fine-tunes all of the parameters in a …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Structured pruning learns compact and accurate models

M **a, Z Zhong, D Chen - arxiv preprint arxiv:2204.00408, 2022 - arxiv.org
The growing size of neural language models has led to increased attention in model
compression. The two predominant approaches are pruning, which gradually removes …

Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments

D Wu, S Lv, M Jiang, H Song - Computers and Electronics in Agriculture, 2020 - Elsevier
Achieving the rapid and accurate detection of apple flowers in natural environments is
essential for yield estimation and the development of an automatic flower thinner. A real-time …