Model compression and hardware acceleration for neural networks: A comprehensive survey
L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020
PACT: Parameterized clipping activation for quantized neural networks
Deep learning algorithms achieve high classification accuracy at the expense of significant
computation cost. To address this cost, a number of quantization schemes have been …
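The scheme this entry describes, PACT, replaces ReLU with an activation clipped at a learnable level α; the clipped range [0, α] is quantized uniformly and α is trained together with the network weights. A minimal PyTorch sketch of that idea (the module name, default bit-width, and initial α are illustrative assumptions, not values from the paper):

```python
import torch
import torch.nn as nn

class PACTQuant(nn.Module):
    """Clip activations to [0, alpha] with a learnable alpha, then
    quantize the clipped range uniformly to `bits` bits."""
    def __init__(self, bits: int = 4, alpha_init: float = 6.0):  # defaults are assumptions
        super().__init__()
        self.bits = bits
        self.alpha = nn.Parameter(torch.tensor(alpha_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Clip to [0, alpha]; alpha receives a gradient wherever x >= alpha.
        y = torch.clamp(x, min=0.0) - torch.clamp(x - self.alpha, min=0.0)
        # Uniform quantization of [0, alpha] into 2^bits - 1 steps.
        scale = (2 ** self.bits - 1) / self.alpha
        y_q = torch.round(y * scale) / scale
        # Straight-through estimator: forward uses y_q, backward uses y.
        return y + (y_q - y).detach()

x = torch.randn(4, 8, requires_grad=True)
q = PACTQuant(bits=2)
q(x).sum().backward()
print(q.alpha.grad)  # alpha is trained with the task loss
```

Because rounding is non-differentiable, the sketch relies on a straight-through estimator; the clipping term is what lets the task loss shape α.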
Differentiable soft quantization: Bridging full-precision and low-bit neural networks
Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently
accelerate inference while also reducing the memory consumption of deep neural …
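The core idea here is to replace the hard rounding staircase with a scaled tanh inside each quantization bin, so the quantizer is differentiable end to end and can be annealed toward hard quantization by raising a temperature. A simplified sketch, assuming a fixed clipping range (the paper also learns the range) and an illustrative temperature k:

```python
import torch

def dsq(x: torch.Tensor, lo: float, hi: float, bits: int, k: float = 10.0):
    """Differentiable soft quantization: within each bin, a scaled tanh
    approximates the rounding step; larger k -> closer to the hard staircase."""
    levels = 2 ** bits - 1
    delta = (hi - lo) / levels          # bin width
    x = torch.clamp(x, lo, hi)
    # Index of the bin x falls into, and the bin's midpoint.
    i = torch.clamp(torch.floor((x - lo) / delta), max=levels - 1)
    mid = lo + (i + 0.5) * delta
    # Scaled tanh mapping [mid - delta/2, mid + delta/2] onto [-1, 1].
    s = 1.0 / torch.tanh(torch.tensor(0.5 * k * delta))
    phi = s * torch.tanh(k * (x - mid))
    return lo + delta * (i + 0.5 * (phi + 1.0))
```

At each bin midpoint the output equals the input, and at the bin edges it reaches the neighboring level boundaries, so the map is continuous and differentiable everywhere except the (zero-gradient) bin index.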
Learning to quantize deep networks by optimizing quantization intervals with task loss
Reducing the bit-widths of the activations and weights of deep networks makes them efficient to
compute and store in memory, which is crucial for their deployment on resource-limited …
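In this approach the quantization interval itself, parameterized by a center and a half-width, is trained with the task loss: values below the interval are pruned to zero, values above it are clipped, and values inside are rescaled and quantized. A simplified activation-only sketch (parameter names and initial values are my own; the paper uses a more general non-linear transformer):

```python
import torch
import torch.nn as nn

class QILActQuant(nn.Module):
    """Learn the quantization interval [center - dist, center + dist]
    jointly with the network, via the task loss."""
    def __init__(self, bits: int = 2):
        super().__init__()
        self.bits = bits
        self.center = nn.Parameter(torch.tensor(0.5))  # illustrative inits
        self.dist = nn.Parameter(torch.tensor(0.5))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        lo = self.center - self.dist
        hi = self.center + self.dist
        # Prune below lo, clip above hi, rescale the interval to [0, 1].
        t = torch.clamp((x - lo) / (hi - lo).clamp(min=1e-6), 0.0, 1.0)
        # Uniform quantization with a straight-through estimator.
        levels = 2 ** self.bits - 1
        t_q = torch.round(t * levels) / levels
        return t + (t_q - t).detach()
```

Gradients reach the center and half-width through the unclipped region, so the network can trade pruning against clipping to minimize the task loss directly.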
Accurate and efficient 2-bit quantized neural networks
J Choi, S Venkataramani… - Proceedings of …, 2019 - proceedings.mlsys.org
Deep learning algorithms achieve high classification accuracy at the expense of significant
computation cost. In order to reduce this cost, several quantization schemes have gained …
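This line of work keeps both weights and activations at 2 bits by choosing clipping ranges carefully, e.g. from weight statistics. A hedged sketch of statistics-based 2-bit symmetric weight quantization (the scale rule a = 2.5·std is a stand-in assumption, not the paper's fitted statistics-aware coefficients):

```python
import torch

def quantize_weights_2bit(w: torch.Tensor) -> torch.Tensor:
    """Symmetric 2-bit weight quantization to the four levels
    {-a, -a/3, +a/3, +a}, with the clipping scale a derived from
    weight statistics (the 2.5*std rule is an illustrative choice)."""
    a = (2.5 * w.std()).clamp(min=1e-8)
    # Map weights to [-1, 1], then round onto the 4-level uniform grid.
    t = torch.clamp(w / a, -1.0, 1.0)
    q = torch.round((t + 1.0) * 1.5) / 1.5 - 1.0
    return q * a
```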
Compression of deep learning models for text: A survey
In recent years, the fields of natural language processing (NLP) and information retrieval (IR)
have made tremendous progress thanks to deep learning models like Recurrent Neural …
Adabits: Neural network quantization with adaptive bit-widths
Deep neural networks with adaptive configurations have gained increasing attention because
they allow instant and flexible deployment on platforms with different resource …
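Adaptive bit-width here means one model whose quantizers can run at several precisions, selected at deployment time to fit the platform's budget. A minimal sketch of a switchable weight quantizer (the class and method names are illustrative; the paper additionally trains all candidate bit-widths jointly):

```python
import torch
import torch.nn as nn

class SwitchableQuant(nn.Module):
    """One set of full-precision weights, several candidate bit-widths;
    the active precision is switched at run time."""
    def __init__(self, bit_widths=(2, 4, 8)):
        super().__init__()
        self.bit_widths = bit_widths
        self.active_bits = bit_widths[-1]   # default: highest precision

    def set_bits(self, bits: int) -> None:
        assert bits in self.bit_widths
        self.active_bits = bits

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        # Symmetric uniform quantization at the active bit-width.
        levels = 2 ** (self.active_bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / levels
        return torch.round(w / scale).clamp(-levels, levels) * scale

q = SwitchableQuant()
q.set_bits(4)               # e.g., drop to 4 bits on a constrained device
w_q = q(torch.randn(64, 64))
```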
Energy-efficient neural network accelerator based on outlier-aware low-precision computation
Owing to the presence of large values, which we call outliers, conventional methods of
quantization fail to achieve significantly low precision, e.g., four bits, for very deep neural …
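The accelerator's premise is that a few large-magnitude values force a wide clipping range and waste the low-precision grid on the rest of the tensor; handling outliers separately lets the bulk of the values use, e.g., 4 bits. A sketch of that split (the 1% outlier ratio is an assumption for illustration):

```python
import torch

def outlier_aware_quantize(w: torch.Tensor, bits: int = 4,
                           outlier_frac: float = 0.01):
    """Dense low-precision part for most values, plus a small sparse set
    of full-precision outliers, so the clipping range is set by the bulk
    of the distribution rather than by its extremes."""
    k = max(1, int(outlier_frac * w.numel()))
    # Threshold separating the k largest-magnitude values (outliers).
    thresh = w.abs().flatten().topk(k).values.min()
    outlier_mask = w.abs() >= thresh
    # Dense part: uniform low-precision quantization of the non-outliers.
    levels = 2 ** (bits - 1) - 1
    scale = (thresh / levels).clamp(min=1e-8)
    dense = torch.round(torch.clamp(w, -thresh, thresh) / scale) * scale
    # Outliers stay in full precision; hardware would store them sparsely.
    return torch.where(outlier_mask, w, dense), outlier_mask
```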