TLP Balancer: Predictive Thread Allocation for Multi-Tenant Inference in Embedded GPUs

M Gil, J Jeon, J Kim, S Choi, G Koo… - IEEE Embedded …, 2024 - ieeexplore.ieee.org
This paper introduces a novel software technique to optimize thread allocation for merged
and fused kernels in multi-tenant inference systems on embedded Graphics Processing …