Kernel Tuner: A search-optimizing GPU code auto-tuner
B van Werkhoven - Future Generation Computer Systems, 2019 - Elsevier
A very common problem in GPU programming is that some combination of thread block
dimensions and other code optimization parameters, like tiling or unrolling factors, results in …
GPGPU performance estimation with core and memory frequency scaling
Contemporary graphics processing units (GPUs) support dynamic voltage and frequency
scaling to balance computational performance and energy consumption. However, accurate …
Bayesian Optimization for auto-tuning GPU kernels
Finding optimal parameter configurations for tunable GPU kernels is a non-trivial exercise
for large search spaces, even when automated. This poses an optimization task on a …
Benchmarking optimization algorithms for auto-tuning GPU kernels
Recent years have witnessed phenomenal growth in the application and capabilities of
graphical processing units (GPUs) due to their high parallel computation power at relatively …
Going green: optimizing GPUs for energy efficiency through model-steered auto-tuning
R Schoonhoven, B Veenboer… - 2022 IEEE/ACM …, 2022 - ieeexplore.ieee.org
Graphics Processing Units (GPUs) have revolutionized the computing landscape over the
past decade. However, the growing energy demands of data centres and computing …
cstuner: Scalable auto-tuning framework for complex stencil computation on gpus
Q Sun, Y Liu, H Yang, Z Jiang, X Liu… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
The computational patterns of stencil operations are commonly used in HPC applications.
Many HPC platforms utilize the computation capability of GPUs to accelerate stencil …
LS-CAT: a large-scale CUDA AutoTuning dataset
The effectiveness of Machine Learning (ML) methods depends on access to large suitable
datasets. In this article, we present how we build the LS-CAT (Large-Scale CUDA …
Adaptive Auto-Tuning Framework for Global Exploration of Stencil Optimization on GPUs
Stencil computations are widely used in high performance computing (HPC) applications.
Many HPC platforms utilize the high computation capability of GPUs to accelerate stencil …
Coding Ants: Optimization of GPU code using ant colony optimization
This article proposes the Coding Ants framework, an approach for auto-tuning which uses
ant colony optimization to find a sequence of code optimizations for GPU architectures. The …
Optimal Kernel Tuning Parameter Prediction using Deep Sequence Models
GPU kernels have come to the forefront of computing due to their utility in varied fields, from
high-performance computing to machine learning. A typical GPU compute kernel is invoked …