Taking GPU Programming Models to Task for Performance Portability
Performance analysis of matrix-free conjugate gradient kernels using SYCL
We examine the performance of matrix-free SYCL implementations of the conjugate gradient
method for solving sparse linear systems of equations. Performance is tested on an NVIDIA …
Towards Alignment of Parallelism in SYCL and ISO C++
SYCL began as a C++ abstraction for OpenCL concepts, whereas parallelism in ISO C++
evolved from the algorithms in the standard library. This history has resulted in the two …
An Evaluative Comparison of Performance Portability across GPU Programming Models
JH Davis, P Sivaraman, I Minn, K Parasyris… - ar** and
maintaining a single codebase that can run efficiently on a range of accelerator-based …
On the Incorrect Use of Application Efficiency to Calculate Performance Portability
A Marowka - arXiv preprint arXiv:2407.00232, 2024 - arxiv.org
The emergence of heterogeneity in high-performance computing, which harnesses under
one integrated system several platforms of different architectures, also led to the …
Sum Reduction with OpenMP Offload on NVIDIA Grace-Hopper System
Z ** - SC24-W: Workshops of the International Conference …, 2024 - ieeexplore.ieee.org
Sum reduction is a primitive operation in parallel computing. With OpenMP directives that
enable data and computation offload to a graphics processing unit (GPU), we annotate the …
Portability Efficiency Approach for Calculating Performance Portability
A Marowka - researchgate.net
The emergence of heterogeneity in high-performance computing, which harnesses under
one integrated system several platforms of different architectures, also led to the …
FPGA-Based Hardware Acceleration of Canny Image Edge Detector Using SYCL
M Hashemi - 2022 - scholar.uwindsor.ca
Detecting edges is one of the most fundamental algorithms in image processing, in many
fields of science ranging from space exploration imaging and radar applications, to medical …