Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computation- and memory-intensive applications, tensors of these …
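
A minimal sketch of the kind of irregular computation this survey covers: a matrix-vector multiply over a CSR (compressed sparse row) matrix, where only the stored nonzeros are visited. The values here are made up for illustration.

```python
import numpy as np

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x with A in CSR form: only stored nonzeros are visited."""
    y = np.zeros(len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y

# Dense equivalent of A: [[0, 2, 0], [1, 0, 3]]
values  = np.array([2.0, 1.0, 3.0])  # nonzero entries, row by row
col_idx = np.array([1, 0, 2])        # column index of each nonzero
row_ptr = np.array([0, 1, 3])        # where each row's nonzeros start/end
x = np.array([1.0, 1.0, 1.0])
print(csr_matvec(values, col_idx, row_ptr, x))  # -> [2. 4.]
```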

HighLight: Efficient and flexible DNN acceleration with hierarchical structured sparsity

YN Wu, PA Tsai, S Muralidharan, A Parashar… - Proceedings of the 56th …, 2023 - dl.acm.org
Due to complex interactions among various deep neural network (DNN) optimization
techniques, modern DNNs can have weights and activations that are dense or sparse with …
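
Not HighLight's hierarchical scheme, just a minimal sketch of one common structured-sparsity building block (2:4 pruning: keep the two largest of every four consecutive weights), to make "structured sparse" concrete.

```python
import numpy as np

def prune_2_of_4(w):
    """Keep the 2 largest-magnitude weights in each group of 4; zero the rest."""
    w = w.reshape(-1, 4)                          # groups of 4 weights
    keep = np.argsort(np.abs(w), axis=1)[:, 2:]   # indices of the 2 largest |w|
    mask = np.zeros_like(w, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=1)
    return (w * mask).reshape(-1)

w = np.array([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.6, -0.3])
print(prune_2_of_4(w))  # exactly half of the weights become zero
```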

SNICIT: Accelerating sparse neural network inference via compression at inference time on GPU

S Jiang, TW Huang, B Yu, TY Ho - Proceedings of the 52nd International …, 2023 - dl.acm.org
Sparse deep neural networks (DNNs) have become an important technique for reducing the
inference cost of large DNNs. However, computing large sparse DNNs is very challenging …
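
SNICIT's contribution is compressing activations at inference time on the GPU; as a point of reference, here is a minimal CPU sketch of the baseline workload it accelerates: layers stored as sparse matrices applied to an activation vector.

```python
import numpy as np
from scipy.sparse import random as sparse_random

rng = np.random.default_rng(0)
# Four layers where 99% of weights are zero, stored in CSR form.
layers = [sparse_random(1024, 1024, density=0.01, format="csr", random_state=i)
          for i in range(4)]

x = rng.random(1024)
for W in layers:
    x = np.maximum(W @ x, 0.0)  # sparse matvec, then ReLU zeroes many entries
print(f"nonzero activations: {np.count_nonzero(x)} / {x.size}")
```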

TPrune: Efficient Transformer pruning for mobile devices

J Mao, H Yang, A Li, H Li, Y Chen - ACM Transactions on Cyber …, 2021 - dl.acm.org
The invention of the Transformer model structure has boosted the performance of Neural
Machine Translation (NMT) tasks to an unprecedented level. Much previous work has been done to …
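
TPrune targets structured sparsity patterns that map well to mobile hardware; the sketch below shows only the simplest baseline, unstructured magnitude pruning to a target sparsity, to make "pruning" concrete.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= threshold, w, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(512, 512))        # stand-in for an attention projection
w_pruned = magnitude_prune(w, 0.9)
print(f"zeros: {np.mean(w_pruned == 0):.2%}")  # ~90%
```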

Special session: Approximate TinyML systems: Full system approximations for extreme energy-efficiency in intelligent edge devices

A Raha, S Ghosh, D Mohapatra… - 2021 IEEE 39th …, 2021 - ieeexplore.ieee.org
Approximate computing (AxC) has advanced from being an emerging design paradigm to
becoming one of the most popular and effective methods of energy optimization for …
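
One common approximation knob (not necessarily the one used in this work) is reduced numeric precision; below is a minimal sketch of symmetric int8 quantization of activations and the small, bounded error it trades for efficiency.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization of x to int8 plus a float scale."""
    m = float(np.max(np.abs(x)))
    scale = m / 127.0 if m > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

x = np.random.default_rng(0).normal(size=8).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = q.astype(np.float32) * scale  # dequantized approximation of x
print(np.max(np.abs(x - x_hat)))      # error bounded by ~scale / 2
```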

Elite BackProp: Training Sparse Interpretable Neurons

T Kasioumis, J Townsend, H Inakoshi - NeSy, 2021 - lr2020.iit.demokritos.gr
In this paper we present a method called Elite BackProp (EBP) to train more interpretable
convolutional neural networks (CNNs) by introducing class-wise activation sparsity; after …
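
EBP's class-wise training rule is more elaborate than this, but the core operation it builds on is top-k activation sparsification, sketched here for a single vector of per-filter activations.

```python
import numpy as np

def topk_activations(a, k):
    """Keep only the k strongest activations; zero the rest."""
    out = np.zeros_like(a)
    elite = np.argsort(a)[-k:]   # indices of the k largest activations
    out[elite] = a[elite]
    return out

a = np.array([0.1, 2.3, 0.0, 1.7, 0.4, 0.9])  # per-filter mean activations
print(topk_activations(a, k=2))               # only filters 1 and 3 remain
```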

ARTS: An adaptive regularization training schedule for activation sparsity exploration

Z Zhu, A Pourtaherian, L Waeijen… - 2022 25th Euromicro …, 2022 - ieeexplore.ieee.org
Brain-inspired event-based processors have attracted considerable attention for edge
deployment because of their ability to efficiently process Convolutional Neural Networks …
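
The paper's contribution is the adaptive schedule itself; the sketch below shows only the generic ingredients, an L1 penalty on activations plus a hypothetical controller that nudges the penalty weight toward a target sparsity level.

```python
import numpy as np

def sparsity(a):
    """Fraction of activations that are exactly zero."""
    return float(np.mean(a == 0))

lam, target = 1e-4, 0.8
for step in range(3):  # stand-in for a training loop
    a = np.maximum(np.random.default_rng(step).normal(size=1000), 0)  # ReLU
    penalty = lam * np.mean(np.abs(a))           # L1 activation penalty term
    lam *= 1.1 if sparsity(a) < target else 0.9  # adapt the penalty weight
    print(f"step {step}: sparsity={sparsity(a):.2f}, penalty={penalty:.2e}")
```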

Exploiting activation sparsity for fast CNN inference on mobile GPUs

C Oh, J So, S Kim, Y Yi - ACM Transactions on Embedded Computing …, 2021 - dl.acm.org
Over the past several years, the need for on-device deep learning has been rapidly
increasing, and efficient CNN inference on mobile platforms has been actively researched …
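
The gain these methods exploit: when an input activation is zero, the matching column of the next layer's weight matrix can be skipped entirely. A minimal CPU sketch follows (the paper implements this with mobile GPU kernels).

```python
import numpy as np

def sparse_input_matvec(W, x):
    """y = W @ x computed only over the nonzero entries of x."""
    nz = np.nonzero(x)[0]     # indices of active (nonzero) neurons
    return W[:, nz] @ x[nz]   # touch only the corresponding columns of W

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 1024))
x = np.maximum(rng.normal(size=1024), 0)  # ReLU output, roughly half zeros
assert np.allclose(sparse_input_matvec(W, x), W @ x)
```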

Improving Interpretability and Accuracy in Neuro-Symbolic Rule Extraction Using Class-Specific Sparse Filters

P Padalkar, J Lee, S Wei, G Gupta - arXiv preprint arXiv:2501.16677, 2025 - arxiv.org
There has been significant focus on creating neuro-symbolic models for interpretable image
classification using Convolutional Neural Networks (CNNs). These methods aim to replace …