Efficient acceleration of deep learning inference on resource-constrained edge devices: A review
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …
Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …
HighLight: Efficient and flexible DNN acceleration with hierarchical structured sparsity
Due to complex interactions among various deep neural network (DNN) optimization
techniques, modern DNNs can have weights and activations that are dense or sparse with …
SNICIT: Accelerating sparse neural network inference via compression at inference time on GPU
Sparse deep neural network (DNN) has become an important technique for reducing the
inference cost of large DNNs. However, computing large sparse DNNs is very challenging …
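To make the cost model behind sparse-DNN inference concrete, here is a minimal sketch of one feed-forward layer with a CSR weight matrix, where work scales with the number of stored nonzeros rather than the full matrix size. This is only an illustration under assumed shapes, not SNICIT's algorithm (which additionally compresses activations at inference time):

```python
import numpy as np
from scipy import sparse

def sparse_layer(W_csr, x, bias):
    """One feed-forward layer with a sparse weight matrix.

    W_csr : scipy.sparse.csr_matrix of shape (out_dim, in_dim)
    x     : dense activation vector of shape (in_dim,)
    """
    # CSR matvec touches only the stored nonzeros, so cost scales
    # with nnz(W) rather than out_dim * in_dim.
    return np.maximum(W_csr @ x + bias, 0.0)  # ReLU

# Toy usage: a 90%-sparse random weight matrix.
rng = np.random.default_rng(0)
W = sparse.random(256, 512, density=0.1, random_state=rng, format="csr")
x = rng.standard_normal(512)
b = np.zeros(256)
y = sparse_layer(W, x, b)
```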
TPrune: Efficient transformer pruning for mobile devices
The invention of the Transformer model structure has boosted the performance of Neural Machine
Translation (NMT) tasks to an unprecedented level. Much previous work has been done to …
Special session: Approximate TinyML systems: Full system approximations for extreme energy-efficiency in intelligent edge devices
Approximate computing (AxC) has advanced from being an emerging design paradigm to
becoming one of the most popular and effective methods of energy optimization for …
Elite BackProp: Training Sparse Interpretable Neurons
In this paper we present a method called Elite BackProp (EBP) to train more interpretable
convolutional neural networks (CNNs) by introducing class-wise activation sparsity; after …
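The class-wise activation sparsity mentioned in this entry can be illustrated with a small sketch: an L1-style penalty on the per-class mean activation profile, so each class comes to rely on a few "elite" filters. The function name and the per-class averaging scheme are assumptions for illustration, not EBP's actual training rule:

```python
import numpy as np

def class_wise_sparsity_penalty(acts, labels, num_classes):
    """Hypothetical class-wise activation-sparsity penalty.

    acts   : (batch, num_filters) mean filter activations per sample
    labels : (batch,) integer class labels
    Returns an L1 penalty on each class's mean activation profile,
    pushing every class to activate only a small subset of filters.
    """
    penalty = 0.0
    for c in range(num_classes):
        mask = labels == c
        if not mask.any():
            continue
        class_profile = acts[mask].mean(axis=0)  # average response per filter
        penalty += np.abs(class_profile).sum()   # L1 -> few active filters
    return penalty / num_classes

# Toy usage.
rng = np.random.default_rng(1)
acts = rng.random((32, 64))
labels = rng.integers(0, 10, size=32)
reg = class_wise_sparsity_penalty(acts, labels, num_classes=10)
```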
ARTS: An adaptive regularization training schedule for activation sparsity exploration
Brain-inspired event-based processors have attracted considerable attention for edge
deployment because of their ability to efficiently process Convolutional Neural Networks …
Exploiting activation sparsity for fast CNN inference on mobile GPUs
Over the past several years, the need for on-device deep learning has been rapidly
increasing, and efficient CNN inference on mobile platforms has been actively researched …
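The basic trick named in this entry's title, exploiting activation sparsity, is to skip computation for zero entries, which are plentiful after ReLU. A minimal gather-based sketch (illustrative only, not the paper's mobile-GPU kernel):

```python
import numpy as np

def matvec_skip_zeros(W, x):
    """Compute W @ x touching only columns where x is nonzero.

    After ReLU, activation vectors are often mostly zero, so gathering
    the nonzero columns first can save most of the multiply-adds.
    """
    nz = np.flatnonzero(x)   # indices of nonzero activations
    return W[:, nz] @ x[nz]  # reduced matvec over nonzero columns

# Toy usage: a ReLU output with roughly 80% zeros.
rng = np.random.default_rng(2)
x = np.maximum(rng.standard_normal(512) - 1.0, 0.0)
W = rng.standard_normal((256, 512))
assert np.allclose(matvec_skip_zeros(W, x), W @ x)
```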
Improving Interpretability and Accuracy in Neuro-Symbolic Rule Extraction Using Class-Specific Sparse Filters
There has been significant focus on creating neuro-symbolic models for interpretable image
classification using Convolutional Neural Networks (CNNs). These methods aim to replace …