A review of binarized neural networks

T Simons, DJ Lee - Electronics, 2019 - mdpi.com
In this work, we review Binarized Neural Networks (BNNs). BNNs are deep neural networks
that use binary values for activations and weights instead of full-precision values. With …
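
The snippet only gestures at the core mechanic, so here is a minimal NumPy sketch of it: weights and activations are binarized with the sign function, and a dot product between ±1 vectors reduces to XOR plus popcount on packed bits. All names (`binarize`, `pack`, `xnor_dot`) are illustrative, not from the review.

```python
import numpy as np

def binarize(x):
    """Map real values to {-1, +1} with the sign function (0 maps to +1)."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def pack(signs):
    """Pack a {-1, +1} vector into an int bitmask (+1 -> 1, -1 -> 0)."""
    bits = 0
    for i, s in enumerate(signs):
        if s > 0:
            bits |= 1 << i
    return bits

def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors via bitwise ops.

    Matching bits contribute +1 and mismatches -1, so the dot product
    equals n - 2 * popcount(a XOR b).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches

w = binarize(np.random.randn(64))
x = binarize(np.random.randn(64))
assert xnor_dot(pack(w), pack(x), 64) == int(w @ x)  # bitwise == arithmetic
```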

Lightweight deep learning: An overview

CH Wang, KY Huang, Y Yao, JC Chen… - IEEE consumer …, 2022 - ieeexplore.ieee.org
With the recent success of deep neural networks (DNNs) in the field of artificial
intelligence, the demand for deploying DNNs has drawn tremendous attention because it can …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-power computer …, 2022 - taylorfrancis.com
This chapter surveys approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages/disadvantages of current methods …
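
As one concrete instance of the methods such a survey covers, below is a hedged NumPy sketch of uniform affine (asymmetric) quantization: a real tensor is mapped to b-bit integers through a scale and zero-point, then dequantized for comparison. The helper names are mine, not the chapter's.

```python
import numpy as np

def quantize_affine(x, num_bits=8):
    """Uniform affine quantization: x is approximated by scale * (q - zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # guard against zero range
    zero_point = int(round(qmin - x_min / scale))   # makes x_min map to qmin
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(1000).astype(np.float32)
q, s, z = quantize_affine(x)
max_err = np.abs(x - dequantize(q, s, z)).max()  # bounded by about scale / 2
```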

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications, exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …
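
For illustration, here is a minimal sketch of global magnitude pruning, one of the simplest techniques in the family this survey covers; the function name and interface are assumptions, not the survey's.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude entries across all weight tensors.

    `weights` is a list of arrays; `sparsity` is the global fraction of
    entries to remove. Returns pruned copies plus the binary masks.
    """
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(all_mags, sparsity)   # global magnitude cutoff
    masks = [(np.abs(w) > threshold).astype(w.dtype) for w in weights]
    return [w * m for w, m in zip(weights, masks)], masks

layers = [np.random.randn(128, 64), np.random.randn(64, 10)]
pruned, masks = magnitude_prune(layers, sparsity=0.9)
# about 10% of entries survive; the masks are typically kept fixed while
# the remaining weights are fine-tuned to recover accuracy
```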

Binary neural networks: A survey

H Qin, R Gong, X Liu, X Bai, J Song, N Sebe - Pattern Recognition, 2020 - Elsevier
The binary neural network, which greatly reduces storage and computation, serves as a
promising technique for deploying deep models on resource-limited devices. However, the …
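
Training such networks typically relies on the straight-through estimator (STE): the forward pass uses sign(w), while the backward pass treats binarization approximately as the identity, clipped to |w| <= 1. A minimal NumPy sketch with illustrative names:

```python
import numpy as np

def binarize_forward(w):
    """Forward: snap latent real-valued weights to {-1, +1}."""
    return np.where(w >= 0, 1.0, -1.0)

def binarize_backward(grad_out, w):
    """Backward (STE): pass gradients through where |w| <= 1, since the
    true derivative of sign(w) is zero almost everywhere."""
    return grad_out * (np.abs(w) <= 1.0)

# one SGD step on the latent weights for a scalar squared-error loss
w = 0.5 * np.random.randn(8)
x = np.random.randn(8)
target, lr = 1.0, 0.1
y = binarize_forward(w) @ x      # forward pass uses binary weights
grad_y = 2.0 * (y - target)      # d/dy of (y - target)^2
grad_w = binarize_backward(grad_y * x, w)
w -= lr * grad_w                 # update the real-valued weights
```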

Similarity-preserving knowledge distillation

F Tung, G Mori - Proceedings of the IEEE/CVF international …, 2019 - openaccess.thecvf.com
Knowledge distillation is a widely applicable technique for training a student neural
network under the guidance of a trained teacher network. For example, in neural network …
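
Under my reading of the paper, the similarity-preserving loss matches pairwise activation similarities within a mini-batch rather than the activations themselves: flatten each network's activations to (batch, features), form the batch Gram matrix, L2-normalize its rows, and penalize the Frobenius gap between student and teacher. A NumPy sketch:

```python
import numpy as np

def similarity_matrix(acts):
    """Row-normalized batch Gram matrix G = normalize(A @ A.T)."""
    a = acts.reshape(acts.shape[0], -1)             # (batch, features)
    g = a @ a.T                                     # pairwise similarities
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    return g / np.maximum(norms, 1e-12)

def sp_loss(student_acts, teacher_acts):
    """Similarity-preserving loss: squared Frobenius gap divided by b^2."""
    gs = similarity_matrix(student_acts)
    gt = similarity_matrix(teacher_acts)
    b = student_acts.shape[0]
    return np.sum((gs - gt) ** 2) / (b * b)

s = np.random.randn(16, 32, 4, 4)   # student feature maps (b, c, h, w)
t = np.random.randn(16, 64, 4, 4)   # teacher width may differ; the b x b
loss = sp_loss(s, t)                # Gram matrices still align
```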

BiLLM: Pushing the limit of post-training quantization for LLMs

W Huang, Y Liu, H Qin, Y Li, S Zhang, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Pretrained large language models (LLMs) exhibit exceptional general language processing
capabilities but come with significant demands on memory and computational resources. As …
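
BiLLM's full method involves identifying salient weights and splitting the remaining distribution, which is beyond a snippet; as a much smaller illustration of one underlying primitive, here is a generic residual-binarization sketch in NumPy, approximating a weight matrix by a sum of scaled sign matrices. This shows the general idea only, not BiLLM itself.

```python
import numpy as np

def residual_binarize(w, order=2):
    """Approximate w as a sum of alpha_i * sign(r_i) over residuals r_i.

    Each stage binarizes the current residual with the MSE-optimal scale
    alpha = mean(|r|), then subtracts the stage's approximation.
    """
    residual = w.astype(np.float64)
    terms = []
    for _ in range(order):
        alpha = np.abs(residual).mean()
        b = np.where(residual >= 0, 1.0, -1.0)
        terms.append((alpha, b))
        residual = residual - alpha * b
    return terms

w = np.random.randn(256, 256)
approx = sum(alpha * b for alpha, b in residual_binarize(w, order=2))
rel_err = np.linalg.norm(w - approx) / np.linalg.norm(w)  # drops with order
```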

ReActNet: Towards precise binary neural network with generalized activation functions

Z Liu, Z Shen, M Savvides, KT Cheng - Computer Vision (ECCV 2020), Glasgow, UK, August 23–28, 2020 - Springer
In this paper, we propose several ideas for enhancing a binary network to close its accuracy
gap from real-valued networks without incurring any additional computational cost. We first …
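
Two of the paper's generalized activations are RSign, a sign function with a learnable threshold, and RPReLU, a PReLU with learnable input and output shifts. A forward-pass NumPy sketch, with the learnable per-channel parameters simplified to plain floats:

```python
import numpy as np

def rsign(x, alpha=0.0):
    """ReAct-Sign: binarize around a learnable threshold alpha."""
    return np.where(x > alpha, 1.0, -1.0)

def rprelu(x, gamma=0.0, zeta=0.0, beta=0.25):
    """ReAct-PReLU: shift the input by gamma, apply a leaky slope beta on
    the negative side, then shift the output by zeta."""
    shifted = x - gamma
    return np.where(shifted > 0, shifted, beta * shifted) + zeta

x = np.random.randn(4, 8)         # (batch, channels), for example
y = rprelu(rsign(x, alpha=0.1), gamma=-0.2, zeta=0.05)
# in the paper, alpha, gamma, zeta and beta are learned per channel
```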

Low-bit quantization of neural networks for efficient inference

Y Choukroun, E Kravchik, F Yang… - 2019 IEEE/CVF …, 2019 - ieeexplore.ieee.org
Recent machine learning methods use increasingly large deep neural networks to achieve
state-of-the-art results in various tasks. The gains in performance come at the cost of a …
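
A recurring idea in this line of work is choosing the quantization scale by minimizing reconstruction error rather than simply covering the tensor's min/max range. A hedged sketch of such a grid search (illustrative, not the paper's exact procedure):

```python
import numpy as np

def quantize_symmetric(x, scale, num_bits=4):
    """Symmetric quantization followed by immediate dequantization."""
    qmax = 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def mmse_scale(x, num_bits=4, grid=100):
    """Pick the scale minimizing reconstruction MSE over a search grid."""
    base = np.abs(x).max() / (2 ** (num_bits - 1) - 1)  # naive max-based scale
    best_scale, best_err = base, np.inf
    for s in np.linspace(base / 4, base, grid):
        err = np.mean((x - quantize_symmetric(x, s, num_bits)) ** 2)
        if err < best_err:
            best_scale, best_err = s, err
    return best_scale

x = np.random.randn(4096).astype(np.float32)
s = mmse_scale(x)  # usually below the max-based scale: clipping outliers pays
```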

Learning to quantize deep networks by optimizing quantization intervals with task loss

S Jung, C Son, S Lee, J Son, JJ Han… - Proceedings of the …, 2019 - openaccess.thecvf.com
Reducing the bit-widths of the activations and weights of deep networks makes them efficient
to compute and store in memory, which is crucial for their deployment on resource-limited …
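
As a rough illustration of the interval idea, the sketch below quantizes magnitudes inside a parameterized interval, pruning values below it to zero and clipping values above it to the top level; in the paper the interval parameters are learned end-to-end from the task loss, whereas here they are fixed floats.

```python
import numpy as np

def interval_quantize(x, center, distance, num_bits=3):
    """Quantize |x| within [center - distance, center + distance].

    Magnitudes below the interval are pruned to 0; magnitudes above it
    are clipped to 1; levels inside the interval are uniform.
    """
    levels = 2 ** (num_bits - 1) - 1
    t = (np.abs(x) - (center - distance)) / (2 * distance)
    t = np.clip(t, 0.0, 1.0)              # prune below, clip above
    q = np.round(t * levels) / levels     # uniform levels inside the interval
    return np.sign(x) * q

x = np.random.randn(1000)
xq = interval_quantize(x, center=1.0, distance=0.8, num_bits=3)
# gradients with respect to center/distance let the task loss decide
# which weights to prune and which to clip
```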