Custom hardware architectures for deep learning on portable devices: A review

KS Zaman, MBI Reaz, SHM Ali… - … on Neural Networks …, 2021 - ieeexplore.ieee.org
The staggering innovations and emergence of numerous deep learning (DL) applications
have forced researchers to reconsider hardware architecture to accommodate fast and …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

Training transformers with 4-bit integers

H Xi, C Li, J Chen, J Zhu - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Quantizing the activation, weight, and gradient to 4-bit is promising to accelerate neural
network training. However, existing 4-bit training methods require custom numerical formats …

Resource-efficient convolutional networks: A survey on model-, arithmetic-, and implementation-level techniques

JK Lee, L Mukhanov, AS Molahosseini… - ACM Computing …, 2023 - dl.acm.org
Convolutional neural networks (CNNs) are used in our daily life, including self-driving cars,
virtual assistants, social network services, healthcare services, and face recognition, among …

Drawing early-bird tickets: Towards more efficient training of deep networks

H You, C Li, P Xu, Y Fu, Y Wang, X Chen… - arXiv preprint arXiv …, 2019 - arxiv.org
(Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical
subnetworks) for dense, randomly initialized networks, that can be trained alone to achieve …

A survey of on-device machine learning: An algorithms and learning theory perspective

S Dhar, J Guo, J Liu, S Tripathi, U Kurup… - ACM Transactions on …, 2021 - dl.acm.org
The predominant paradigm for using machine learning models on a device is to train a
model in the cloud and perform inference using the trained model on the device. However …

Accurate classification of cherry fruit using deep CNN based on hybrid pooling approach

M Momeny, A Jahanbakhshi, K Jafarnezhad… - Postharvest Biology and …, 2020 - Elsevier
The most important quality parameter of a product is its nutritional value, but marketability of
agricultural products depends primarily on the overall appearance and shape of the …

Mandheling: Mixed-precision on-device DNN training with DSP offloading

D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang… - Proceedings of the 28th …, 2022 - dl.acm.org
This paper proposes Mandheling, the first system that enables highly resource-efficient on-
device training by orchestrating mixed-precision training with on-chip Digital Signal …

ShiftAddNet: A hardware-inspired deep network

H You, X Chen, Y Zhang, C Li, S Li… - Advances in …, 2020 - proceedings.neurips.cc
Multiplication (e.g., convolution) is arguably a cornerstone of modern deep neural networks
(DNNs). However, intensive multiplications cause expensive resource costs that challenge …

Panther: A programmable architecture for neural network training harnessing energy-efficient ReRAM

A Ankit, I El Hajj, SR Chalamalasetti… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
The wide adoption of deep neural networks has been accompanied by ever-increasing
energy and performance demands due to the expensive nature of training them. Numerous …