Lightweight Deep Learning for Resource-Constrained Environments: A Survey

HI Liu, M Galindo, H Xie, LK Wong, HH Shuai… - ACM Computing …, 2024 - dl.acm.org
Over the past decade, the dominance of deep learning has prevailed across various
domains of artificial intelligence, including natural language processing, computer vision …

Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision

X Luo, D Liu, H Kong, S Huai, H Chen… - ACM Transactions on …, 2024 - dl.acm.org
Deep neural networks (DNNs) have recently achieved impressive success across a wide
range of real-world vision and language processing tasks, spanning from image …

A 95.6-TOPS/W deep learning inference accelerator with per-vector scaled 4-bit quantization in 5 nm

B Keller, R Venkatesan, S Dai, SG Tell… - IEEE Journal of Solid …, 2023 - ieeexplore.ieee.org
The energy efficiency of deep neural network (DNN) inference can be improved with custom
accelerators. DNN inference accelerators often employ specialized hardware techniques to …

Demystifying BERT: System design implications

S Pati, S Aga, N Jayasena… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Transfer learning in natural language processing (NLP) uses increasingly large models that
tackle challenging problems. Consequently, these applications are driving the requirements …

NIPQ: Noise proxy-based integrated pseudo-quantization

J Shin, J So, S Park, S Kang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Straight-through estimator (STE), which enables the gradient flow over the non-differentiable
function via approximation, has been favored in studies related to quantization …

Unit scaling: Out-of-the-box low-precision training

C Blake, D Orr, C Luschi - International Conference on …, 2023 - proceedings.mlr.press
We present unit scaling, a paradigm for designing deep learning models that simplifies the
use of low-precision number formats. Training in FP16 or the recently proposed FP8 formats …

Assessing the influence of sensor-induced noise on machine-learning-based changeover detection in CNC machines

VG Biju, AM Schmitt, B Engelmann - Sensors, 2024 - mdpi.com
The noise in sensor data has a substantial impact on the reliability and accuracy of machine learning (ML)
algorithms. A comprehensive framework is proposed to analyze the effects of diverse noise …

2-bit Conformer quantization for automatic speech recognition

O Rybakov, P Meadowlark, S Ding, D Qiu, J Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Large speech models are rapidly gaining traction in the research community. As a result, model
compression has become an important topic, so that these models can fit in memory and be …

PowerQuant: Automorphism search for non-uniform quantization

E Yvinec, A Dapogny, M Cord, K Bailly - arXiv preprint arXiv:2301.09858, 2023 - arxiv.org
Deep neural networks (DNNs) are nowadays ubiquitous in many domains such as computer
vision. However, due to their high latency, the deployment of DNNs hinges on the …

BitDistiller: Unleashing the potential of sub-4-bit LLMs via self-distillation

D Du, Y Zhang, S Cao, J Guo, T Cao, X Chu… - arXiv preprint arXiv …, 2024 - arxiv.org
The upscaling of Large Language Models (LLMs) has yielded impressive advances in
natural language processing, yet it also poses significant deployment challenges. Weight …