Efficient tensor decomposition-based filter pruning
In this paper, we present CORING, which is short for effiCient tensOr decomposition-based
filteR prunING, a novel filter pruning methodology for neural networks. CORING is crafted to …
Plug-and-play: An efficient post-training pruning method for large language models
With the rapid growth of large language models (LLMs), there is increasing demand for
memory and computation in LLMs. Recent efforts on post-training pruning of LLMs aim to …
Discovering sparsity allocation for layer-wise pruning of large language models
In this paper, we present DSA, the first automated framework for discovering sparsity
allocation schemes for layer-wise pruning in Large Language Models (LLMs). LLMs have …
BESA: Pruning large language models with blockwise parameter-efficient sparsity allocation
Large language models (LLMs) have demonstrated outstanding performance in various
tasks such as text summarization and text question-answering. While their performance …
MaskLLM: Learnable semi-structured sparsity for large language models
Large Language Models (LLMs) are distinguished by their massive parameter counts, which
typically result in significant redundancy. This work introduces MaskLLM, a learnable …
Automatic network pruning via Hilbert-Schmidt independence criterion lasso under information bottleneck principle
Most existing neural network pruning methods hand-craft their importance criteria and the
structures to prune, which creates heavy and unintended dependencies on heuristics and …
Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision
Deep neural networks (DNNs) have recently achieved impressive success across a wide
range of real-world vision and language processing tasks, spanning from image …
Towards performance-maximizing neural network pruning via global channel attention
Network pruning has attracted increasing attention recently for its capability of transferring
large-scale neural networks (e.g., CNNs) onto resource-constrained devices. Such a transfer …
Adaptive Layer Sparsity for Large Language Models via Activation Correlation Assessment
Large Language Models (LLMs) have revolutionized the field of natural language
processing with their impressive capabilities. However, their enormous size presents …
ELSA: Exploiting layer-wise N:M sparsity for vision transformer acceleration
N:M sparsity is an emerging model compression method supported by a growing number of
accelerators to speed up sparse matrix multiplication in deep neural networks. Most existing …
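To make the N:M pattern in this last entry concrete, here is a minimal NumPy sketch of semi-structured pruning. The `apply_nm_sparsity` helper and the 2:4 configuration are illustrative assumptions, not code from any listed paper: for each consecutive group of M weights, only the N largest-magnitude entries are kept.

```python
import numpy as np

def apply_nm_sparsity(weights, n=2, m=4):
    """Illustrative N:M pruning sketch (not from any listed paper):
    zero out all but the n largest-magnitude values in every
    consecutive group of m weights along the last axis."""
    flat = weights.reshape(-1, m).copy()
    # Indices of the (m - n) smallest-magnitude entries per group.
    drop = np.argsort(np.abs(flat), axis=1)[:, : m - n]
    np.put_along_axis(flat, drop, 0.0, axis=1)
    return flat.reshape(weights.shape)

w = np.array([[0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.03, 0.6]])
sparse_w = apply_nm_sparsity(w)
# Every group of 4 weights now contains exactly 2 nonzero entries,
# the regular pattern that sparse-tensor accelerators can exploit.
```

The regularity (exactly N nonzeros per M-element group) is what distinguishes this from unstructured pruning and lets hardware skip the zeroed multiplications.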