Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Efficient hardware architectures for accelerating deep neural networks: Survey

P Dhilleswararao, S Boppu, MS Manikandan… - IEEE …, 2022 - ieeexplore.ieee.org
In the modern-day era of technology, a paradigm shift has been witnessed in the areas
involving applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep …

Sigma: A sparse and irregular GEMM accelerator with flexible interconnects for DNN training

E Qin, A Samajdar, H Kwon, V Nadella… - … Symposium on High …, 2020 - ieeexplore.ieee.org
The advent of Deep Learning (DL) has radically transformed the computing industry across
the entire spectrum from algorithms to circuits. As myriad application domains embrace DL, it …

Simba: Scaling deep-learning inference with multi-chip-module-based architecture

YS Shao, J Clemons, R Venkatesan, B Zimmer… - Proceedings of the …, 2019 - dl.acm.org
Package-level integration using multi-chip-modules (MCMs) is a promising approach for
building large-scale systems. Compared to a large monolithic die, an MCM combines many …

Timeloop: A systematic approach to DNN accelerator evaluation

A Parashar, P Raina, YS Shao, YH Chen… - … analysis of systems …, 2019 - ieeexplore.ieee.org
This paper presents Timeloop, an infrastructure for evaluating and exploring the architecture
design space of deep neural network (DNN) accelerators. Timeloop uses a concise and …

Computing graph neural networks: A survey from algorithms to accelerators

S Abadal, A Jain, R Guirado, J López-Alonso… - ACM Computing …, 2021 - dl.acm.org
Graph Neural Networks (GNNs) have exploded onto the machine learning scene in recent
years owing to their capability to model and learn from graph-structured data. Such an ability …

PUMA: A programmable ultra-efficient memristor-based accelerator for machine learning inference

A Ankit, IE Hajj, SR Chalamalasetti, G Ndu… - Proceedings of the …, 2019 - dl.acm.org
Memristor crossbars are circuits capable of performing analog matrix-vector multiplications,
overcoming the fundamental energy efficiency limitations of digital logic. They have been …

Recent advances in convolutional neural network acceleration

Q Zhang, M Zhang, T Chen, Z Sun, Y Ma, B Yu - Neurocomputing, 2019 - Elsevier
In recent years, convolutional neural networks (CNNs) have shown great performance in
various fields such as image classification, pattern recognition, and multi-media …

[BOOK][B] Efficient processing of deep neural networks

V Sze, YH Chen, TJ Yang, JS Emer - 2020 - Springer
This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …

A survey of design and optimization for systolic array-based DNN accelerators

R Xu, S Ma, Y Guo, D Li - ACM Computing Surveys, 2023 - dl.acm.org
In recent years, it has been witnessed that the systolic array is a successful architecture for
DNN hardware accelerators. However, the design of systolic arrays also encountered many …