LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space
Latency and energy consumption are key metrics in the performance of deep neural network
(DNN) accelerators. A significant factor contributing to latency and energy is data transfers …
Energy Cost Modelling for Optimizing Large Language Model Inference on Hardware Accelerators
The rise of Large Language Models (LLMs) has significantly escalated the demand for
efficient LLM inference, primarily fulfilled through cloud-based GPU computing. This …