Model compression for deep neural networks: A survey

Z Li, H Li, L Meng - Computers, 2023 - mdpi.com
Currently, with the rapid development of deep learning, deep neural networks (DNNs) have
been widely applied in various computer vision tasks. However, in the pursuit of …

Gptq: Accurate post-training quantization for generative pre-trained transformers

E Frantar, S Ashkboos, T Hoefler, D Alistarh - arXiv preprint arXiv:2210.17323, 2022 - arxiv.org

Quantizable transformers: Removing outliers by helping attention heads do nothing
Y Bondarenko, M Nagel… - Advances in Neural …, 2023 - proceedings.neurips.cc
Transformer models have been widely adopted in various domains over the last years and
especially large language models have advanced the field of AI significantly. Due to their …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-power computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …
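The survey above concerns quantizing numerical values in DNN computations. As a minimal illustration of the most common scheme it covers, here is a sketch of uniform affine (asymmetric) post-training quantization of a weight tensor to 8-bit integers; the function names and the use of NumPy are my own, not from the survey:

```python
import numpy as np

def quantize_uniform(x, num_bits=8):
    """Uniform affine quantization: map floats to integers in [0, 2^b - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)          # step size between levels
    zero_point = int(round(qmin - x.min() / scale))      # integer that represents 0.0
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float tensor from the integer representation."""
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, s, z = quantize_uniform(w)
w_hat = dequantize(q, s, z)
# rounding error per element is bounded by roughly one quantization step s
```

The round-trip error shrinks as `num_bits` grows, which is the accuracy/compression trade-off such surveys analyze.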

Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators

MJ Rasch, C Mackin, M Le Gallo, A Chen… - Nature …, 2023 - nature.com
Analog in-memory computing—a promising approach for energy-efficient acceleration of
deep learning workloads—computes matrix-vector multiplications but only approximately …
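The snippet notes that analog in-memory computing performs matrix-vector multiplications only approximately. A toy numerical model of that behavior, with Gaussian perturbation standing in for device programming noise (the noise model and magnitude here are illustrative assumptions, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def analog_mvm(W, x, noise_std=0.02):
    """Toy analog in-memory MVM: weights are perturbed by Gaussian
    'programming noise' each time, so the product is only approximate."""
    W_noisy = W + rng.normal(0.0, noise_std, size=W.shape)
    return W_noisy @ x

W = rng.standard_normal((8, 16))
x = rng.standard_normal(16)
exact = W @ x
approx = analog_mvm(W, x)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
# rel_err is small but nonzero; hardware-aware training aims to make
# networks robust to exactly this kind of perturbation
```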