Just-in-time compilation and link-time optimization for OpenMP target offloading
Following the mass adoption of external accelerators for high performance computing, the
overall performance of many applications has become increasingly dependent on relatively …
Enhancing heterogeneous computing through OpenMP and GPU graph
Modern computing platforms are increasingly heterogeneous; most of them include
accelerators such as GPUs. OpenMP as the de-facto standard to parallelize CPU …
Direct GPU compilation and execution for host applications with OpenMP Parallelism
Currently, offloading to accelerators requires users to identify which regions are to be
executed on the device, what memory needs to be transferred, and how synchronization is …
Hybrid PTX analysis for GPU accelerated CNN inferencing aiding computer architecture design
General-Purpose Computation on Graphics Processing Units (GPGPUs) are becoming
crucial in accelerating computing capacity. Due to the massive parallelism capabilities of …
OpenMP kernel language extensions for performance portable GPU codes
In contemporary high-performance computing architectures, the integration of GPU
accelerators has become increasingly prevalent. To harness the full potential of these …
The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned
As the supercomputing landscape diversifies, solutions such as Kokkos to write vendor
agnostic applications and libraries have risen in popularity. Kokkos provides a programming …
GPU First--Execution of Legacy CPU Codes on GPUs
Utilizing GPUs is critical for high performance on heterogeneous systems. However,
leveraging the full potential of GPUs for accelerating legacy CPU applications can be a …
Specialized Kernels for Optimizing GPU Offload in OpenMP
D Chakrabarti, G Rodgers, C Bertolli… - Proceedings of the SC' …, 2023 - dl.acm.org
Programming models for general purpose GPU (GPGPU) computing include grid and non-
grid languages. Grid languages like CUDA and HIP map directly to the GPU hardware and …
Towards a Scalable and Efficient PGAS-Based Distributed OpenMP
MPI+X has been the de facto standard for distributed memory parallel programming. It is
widely used primarily as an explicit two-sided communication model, which often leads to …
Evaluation of Programming Models and Performance for Stencil Computation on GPGPUs
GPGPUs are widely used in high-performance computing. Therefore, it is crucial to
experiment and discover how to better utilize their latest generations of relevant …