TransPIM: A memory-based acceleration via software-hardware co-design for transformer

M Zhou, W Xu, J Kang, T Rosing - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Transformer-based models are state-of-the-art for many machine learning (ML) tasks.
Executing Transformer usually requires a long execution time due to the large memory …

HARDSEA: Hybrid analog-ReRAM clustering and digital-SRAM in-memory computing accelerator for dynamic sparse self-attention in transformer

S Liu, C Mu, H Jiang, Y Wang, J Zhang… - … Transactions on Very …, 2023 - ieeexplore.ieee.org
Self-attention-based transformers have outperformed recurrent and convolutional neural
networks (RNN/CNNs) in many applications. Despite the effectiveness, calculating self …

P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer

X Fu, Q Ren, H Wu, F… - ieeexplore.ieee.org

… and pipeline optimization

L Han, P Huang, Z Zhou, Y Chen… - 2023 60th ACM/IEEE …, 2023 - ieeexplore.ieee.org
The pipeline is an efficient solution to boost performance in non-volatile memory based
computing in memory (nvCIM) convolution neural network (CNN) accelerators. However, the …

E-MAC: Enhanced In-SRAM MAC Accuracy via Digital-to-Time Modulation

S Seyedfaraji, S Shakibhamedan… - IEEE Journal on …, 2024 - ieeexplore.ieee.org
In this article, we introduce a novel technique called enhanced multiplication and accumulation (E-MAC), aimed at enhancing energy efficiency, reducing latency, and improving the …

Allspark: Workload Orchestration for Visual Transformers on Processing In-Memory Systems

M Ge, J Wang, B Chen, Y Zhong, H Du… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
The advent of Transformers has revolutionized computer vision, offering a powerful
alternative to convolutional neural networks (CNNs), especially with the local attention …

[BOOK][B] Software-Hardware Co-design for Processing In-Memory Accelerators

M Zhou - 2023 - search.proquest.com
The explosive increase in data volume in emerging applications poses grand challenges to
computing systems because the bandwidth between compute and memory cannot keep up …

Accelerating Neural Network Training with Processing-in-Memory GPU

X Fei, J Han, J Huang, W Zheng… - 2022 22nd IEEE …, 2022 - ieeexplore.ieee.org
Processing-in-memory (PIM) architecture is promising for accelerating deep neural network
(DNN) training due to its low-latency and energy-efficient data movement between …