Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …

Machine learning for microcontroller-class hardware: A review

SS Saha, SS Sandha, M Srivastava - IEEE Sensors Journal, 2022 - ieeexplore.ieee.org
The advancements in machine learning (ML) opened a new opportunity to bring intelligence
to the low-end Internet-of-Things (IoT) nodes, such as microcontrollers. Conventional ML …

On-device training under 256KB memory

J Lin, L Zhu, WM Chen, WC Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
On-device training enables the model to adapt to new data collected from the sensors by
fine-tuning a pre-trained model. Users can benefit from customized AI models without having …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

Machine learning at the network edge: A survey

MGS Murshed, C Murphy, D Hou, N Khan… - ACM Computing …, 2021 - dl.acm.org
Resource-constrained IoT devices, such as sensors and actuators, have become ubiquitous
in recent years. This has led to the generation of large quantities of data in real-time, which …

A configurable cloud-scale DNN processor for real-time AI

J Fowers, K Ovtcharov, M Papamichael… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
Interactive AI-powered services require low-latency evaluation of deep neural network
(DNN) models, aka "real-time AI". The growing demand for computationally expensive …

Efficient processing of deep neural networks: A tutorial and survey

V Sze, YH Chen, TJ Yang, JS Emer - Proceedings of the IEEE, 2017 - ieeexplore.ieee.org
Deep neural networks (DNNs) are currently widely used for many artificial intelligence (AI)
applications including computer vision, speech recognition, and robotics. While DNNs …