Efficient acceleration of deep learning inference on resource-constrained edge devices: A review
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …
in breakthroughs in many areas. However, deploying these highly accurate models for data …
An updated survey of efficient hardware architectures for accelerating deep convolutional neural networks
Deep Neural Networks (DNNs) are nowadays a common practice in most of the Artificial
Intelligence (AI) applications. Their ability to go beyond human precision has made these …
Intelligence (AI) applications. Their ability to go beyond human precision has made these …
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
reduce the size of neural networks by selectively pruning components. Similarly to their …
Spatten: Efficient sparse attention architecture with cascade token and head pruning
The attention mechanism is becoming increasingly popular in Natural Language Processing
(NLP) applications, showing superior performance than convolutional and recurrent …
(NLP) applications, showing superior performance than convolutional and recurrent …
Sigma: A sparse and irregular gemm accelerator with flexible interconnects for dnn training
The advent of Deep Learning (DL) has radically transformed the computing industry across
the entire spectrum from algorithms to circuits. As myriad application domains embrace DL, it …
the entire spectrum from algorithms to circuits. As myriad application domains embrace DL, it …
A modern primer on processing in memory
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …
design choice goes directly against at least three key trends in computing that cause …
[BOOK][B] Efficient processing of deep neural networks
This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …
Sparch: Efficient architecture for sparse matrix multiplication
Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous task in various
engineering and scientific applications. However, inner product based SpGEMM introduces …
engineering and scientific applications. However, inner product based SpGEMM introduces …
GCNAX: A flexible and energy-efficient accelerator for graph convolutional neural networks
Graph convolutional neural networks (GCNs) have emerged as an effective approach to
extend deep learning for graph data analytics. Given that graphs are usually irregular, as …
extend deep learning for graph data analytics. Given that graphs are usually irregular, as …
A survey on deep learning hardware accelerators for heterogeneous hpc platforms
Recent trends in deep learning (DL) imposed hardware accelerators as the most viable
solution for several classes of high-performance computing (HPC) applications such as …
solution for several classes of high-performance computing (HPC) applications such as …