Hardware-Assisted Virtualization of Neural Processing Units for Cloud Platforms

Y Xue, Y Liu, L Nai, J Huang - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Cloud platforms today have been deploying hardware accelerators like neural processing
units (NPUs) for powering machine learning (ML) inference services. To maximize the …

CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads

I Panopoulos, S Venieris, I Venieris - ACM Transactions on Embedded …, 2024 - dl.acm.org
The relentless expansion of deep learning (DL) applications in recent years has prompted a
pivotal shift towards on-device execution, driven by the urgent need for real-time processing …

Enabling Heterogeneous Computing for Software Developers

J Lau - 2024 - search.proquest.com
The slowing of CMOS technology scaling mismatches the ever-increasing demand for
computational power, leading to a rise in the use of heterogeneous systems, which pair …