A survey of CPU-GPU heterogeneous computing techniques
As both CPUs and GPUs become employed in a wide range of applications, it has been
acknowledged that both of these Processing Units (PUs) have their unique features and …
Dandelion: a compiler and runtime for heterogeneous systems
Computer systems increasingly rely on heterogeneity to achieve greater performance,
scalability and energy efficiency. Because heterogeneous systems typically comprise …
CLTune: A generic auto-tuner for OpenCL kernels
C Nugteren, V Codreanu - 2015 IEEE 9th International …, 2015 - ieeexplore.ieee.org
This work presents CLTune, an auto-tuner for OpenCL kernels. It evaluates and tunes kernel
performance of a generic, user-defined search space of possible parameter-value …
Kernel Tuner: A search-optimizing GPU code auto-tuner
B van Werkhoven - Future Generation Computer Systems, 2019 - Elsevier
A very common problem in GPU programming is that some combination of thread block
dimensions and other code optimization parameters, like tiling or unrolling factors, results in …
IRIS: A portable runtime system exploiting multiple heterogeneous programming systems
Across embedded, mobile, enterprise, and high performance computing systems, computer
architectures are becoming more heterogeneous and complex. This complexity is causing a …
Zorua: A holistic approach to resource virtualization in GPUs
This paper introduces a new resource virtualization framework, Zorua, that decouples the
programmer-specified resource usage of a GPU application from the actual allocation in the …
Simplifying programming and load balancing of data parallel applications on heterogeneous systems
Heterogeneous architectures have experienced a great development thanks to their
excellent cost/performance ratio and low power consumption. But heterogeneity significantly …
Sigmoid: An auto-tuned load balancing algorithm for heterogeneous systems
A challenge that heterogeneous system programmers face is leveraging the performance of
all the devices that integrate the system. This paper presents Sigmoid, a new load balancing …
Bayesian Optimization for auto-tuning GPU kernels
Finding optimal parameter configurations for tunable GPU kernels is a non-trivial exercise
for large search spaces, even when automated. This poses an optimization task on a …
IRIS: A performance-portable framework for cross-platform heterogeneous computing
From edge to exascale, computer architectures are becoming more heterogeneous and
complex. The systems typically have fat nodes, with multicore CPUs and multiple hardware …