Taking GPU Programming Models to Task for Performance Portability

JH Davis, P Sivaraman, J Kitson… - …
… and maintaining a single codebase that can run efficiently on a range of accelerator-based …

Analyzing the Performance Portability of SYCL across CPUs, GPUs, and Hybrid Systems with Protein Database Search

M Costanzo, E Rucci, C García-Sánchez… - arXiv preprint arXiv …, 2024 - arxiv.org
The high-performance computing (HPC) landscape is undergoing rapid transformation, with
an increasing emphasis on energy-efficient and heterogeneous computing environments …

On the Incorrect Use of Application Efficiency to Calculate Performance Portability

A Marowka - arXiv preprint arXiv:2407.00232, 2024 - arxiv.org
The emergence of heterogeneity in high-performance computing, which harnesses under
one integrated system several platforms of different architectures, also led to the …

Evaluation of computational and energy performance in matrix multiplication algorithms on CPU and GPU using MKL, cuBLAS and SYCL

LA Torres, Y Denneulin - arXiv preprint arXiv:2405.17322, 2024 - arxiv.org
Matrix multiplication is fundamental in the backpropagation algorithm used to train deep
neural network models. Libraries like Intel's MKL or NVIDIA's cuBLAS implemented new and …

Ponte Vecchio Across the Atlantic: Single-Node Benchmarking of Two Intel GPU Systems

T Applencourt, A Sadawarte… - SC24-W: Workshops …, 2024 - ieeexplore.ieee.org
Intel Data Center GPU Max 1550, known as Ponte Vecchio (PVC), is a new Intel GPU
architecture for high-performance computing. It is the basis of two systems on the June 2024 …

Development of performance portable spline solver for exa-scale plasma turbulence simulation

Y Asahi, B Legouix, E Bourne… - SC24-W: Workshops …, 2024 - ieeexplore.ieee.org
This paper describes the development of performance portable spline building kernels on
top of Kokkos-kernels. We wish to solve a single matrix equation with multiple right-hand …

GenVectorX: A performance-portable SYCL library for Lorentz Vectors operations

M Dessole, J Chen, A Naumann - arXiv preprint arXiv:2312.02756, 2023 - arxiv.org
The Large Hadron Collider (LHC) at CERN will see an upgraded hardware configuration
which will bring a new era of physics data taking and related computational challenges. To …

Unlocking performance portability on LUMI-G supercomputer: A virtual screening case study

G Accordi, D Gadioli, G Palermo, L Crisci… - Proceedings of the 12th …, 2024 - dl.acm.org
High-Performance Computing is the target system for virtual screening applications, which
aim to suggest which candidates to test in the drug discovery process. The HPC …

Experiences with implementing Kokkos' SYCL backend

D Arndt, D Lebrun-Grandie, C Trott - Proceedings of the 12th …, 2024 - dl.acm.org
With the recent diversification of the hardware landscape in the high-performance computing
community, performance-portability solutions are becoming more and more important. One …

Portability Efficiency Approach for Calculating Performance Portability

A Marowka - researchgate.net
The emergence of heterogeneity in high-performance computing, which harnesses under
one integrated system several platforms of different architectures, also led to the …