SPViT: Enabling faster vision transformers via latency-aware soft token pruning
Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …
RAELLA: Reforming the arithmetic for efficient, low-resolution, and low-loss analog PIM: No retraining required!
Processing-In-Memory (PIM) accelerators have the potential to efficiently run Deep Neural
Network (DNN) inference by reducing costly data movement and by using resistive RAM …
An overview of sparsity exploitation in CNNs for on-device intelligence with software-hardware cross-layer optimizations
This paper presents a detailed overview of sparsity exploitation in deep neural network
(DNN) accelerators. Despite the algorithmic advancements which drove DNNs to become …
STICKER-IM: A 65 nm computing-in-memory NN processor using block-wise sparsity optimization and inter/intra-macro data reuse
Computing-in-memory (CIM) is a promising architecture for energy-efficient neural network
(NN) processors. Several CIM macros have demonstrated high energy efficiency, while CIM …
Structured pruning of RRAM crossbars for efficient in-memory computing acceleration of deep neural networks
The high computational complexity and a large number of parameters of deep neural
networks (DNNs) become the most intensive burden of deep learning hardware design …
AUTO-PRUNE: Automated DNN pruning and mapping for ReRAM-based accelerator
Emergent ReRAM-based accelerators support in-memory computation to accelerate deep
neural network (DNN) inference. Weight matrix pruning of DNNs is a widely used technique …
On-fiber photonic computing
In the 1800s, Charles Babbage envisioned computers as analog devices. However, it was
not until 150 years later that a Mechanical Analog Computer was constructed for the US …
Exploring compute-in-memory architecture granularity for structured pruning of neural networks
Compute-in-Memory (CIM) implemented with Resistive-Random-Access-Memory (RRAM)
crossbars is a promising approach for Deep Neural Network (DNN) acceleration. As the …
Designing efficient bit-level sparsity-tolerant memristive networks
With the rapid progress of deep neural network (DNN) applications on memristive platforms,
there has been a growing interest in the acceleration and compression of memristive …
Bit-Transformer: Transforming bit-level sparsity into higher performance in ReRAM-based accelerator
Resistive Random-Access-Memory (ReRAM) crossbar is one of the most promising neural
network accelerators, thanks to its in-memory and in-situ analog computing abilities for …