A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
Efficient deep learning: A survey on making deep learning models smaller, faster, and better
G Menghani - ACM Computing Surveys, 2023 - dl.acm.org
Deep learning has revolutionized the fields of computer vision, natural language
understanding, speech recognition, information retrieval, and more. However, with the …
Run, don't walk: chasing higher FLOPS for faster neural networks
To design fast neural networks, many works have been focusing on reducing the number of
floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does …
H2O: Heavy-hitter oracle for efficient generative inference of large language models
Large Language Models (LLMs), despite their recent impressive accomplishments,
are notably cost-prohibitive to deploy, particularly for applications involving long-content …
Patch diffusion: Faster and more data-efficient training of diffusion models
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …
Deja vu: Contextual sparsity for efficient llms at inference time
Large language models (LLMs) with hundreds of billions of parameters have sparked a new
wave of exciting AI applications. However, they are computationally expensive at inference …
Spike-driven transformer
Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option
due to their unique spike-based event-driven (i.e., spike-driven) paradigm. In this paper, we …
A simple and effective pruning approach for large language models
As their size increases, Large Language Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
Pruning and quantization for deep neural network acceleration: A survey
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …