Mind the gap: Attainable data movement and operational intensity bounds for tensor algorithms

Q Huang, PA Tsai, JS Emer… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
The architectural design-space exploration (or DSE) process, whether manual or automated,
benefits greatly from knowing the limits of the metrics of interest in advance. Data movement …

FuseMax: Leveraging extended einsums to optimize attention accelerator design

N Nayak, X Wu, TO Odemuyiwa… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Attention for transformers is a critical workload that has recently received significant
'attention' as a target for custom acceleration. Yet, while prior work succeeds in reducing …

LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space

M Gilbert, YN Wu, JS Emer… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Latency and energy consumption are key metrics in the performance of deep neural network
(DNN) accelerators. A significant factor contributing to latency and energy is data transfers …