Just-in-time compilation and link-time optimization for OpenMP target offloading

S Tian, J Huber, J Tramm, B Chapman… - International Workshop on …, 2022 - Springer
Following the mass adoption of external accelerators for high performance computing, the
overall performance of many applications has become increasingly dependent on relatively …

Enhancing heterogeneous computing through OpenMP and GPU graph

C Yu, S Royuela, E Quiñones - … of the 53rd International Conference on …, 2024 - dl.acm.org
Modern computing platforms are increasingly heterogeneous, most of them include
accelerators such as GPU. OpenMP as the de-facto standard to parallelize CPU …

Direct GPU compilation and execution for host applications with OpenMP Parallelism

S Tian, J Huber, K Parasyris… - 2022 IEEE/ACM …, 2022 - ieeexplore.ieee.org
Currently, offloading to accelerators requires users to identify which regions are to be
executed on the device, what memory needs to be transferred, and how synchronization is …

Hybrid PTX analysis for GPU accelerated CNN inferencing aiding computer architecture design

CA Metz, C Plump, BJ Berger… - 2023 Forum on …, 2023 - ieeexplore.ieee.org
General-Purpose Computation on Graphics Processing Units (GPGPUs) are becoming
crucial in accelerating computing capacity. Due to the massive parallelism capabilities of …

OpenMP kernel language extensions for performance portable GPU codes

S Tian, T Scogland, B Chapman… - … of the SC'23 Workshops of …, 2023 - dl.acm.org
In contemporary high-performance computing architectures, the integration of GPU
accelerators has become increasingly prevalent. To harness the full potential of these …

The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned

R Gayatri, SL Olivier, CR Trott, J Doerfert… - … Workshop on OpenMP, 2023 - Springer
As the supercomputing landscape diversifies, solutions such as Kokkos to write vendor
agnostic applications and libraries have risen in popularity. Kokkos provides a programming …

GPU First--Execution of Legacy CPU Codes on GPUs

S Tian, T Scogland, B Chapman, J Doerfert - arXiv preprint arXiv …, 2023 - arxiv.org
Utilizing GPUs is critical for high performance on heterogeneous systems. However,
leveraging the full potential of GPUs for accelerating legacy CPU applications can be a …

Specialized Kernels for Optimizing GPU Offload in OpenMP

D Chakrabarti, G Rodgers, C Bertolli… - Proceedings of the SC' …, 2023 - dl.acm.org
Programming models for general purpose GPU (GPGPU) computing include grid and
non-grid languages. Grid languages like CUDA and HIP map directly to the GPU hardware and …

Towards a Scalable and Efficient PGAS-Based Distributed OpenMP

B Shan, M Araya-Polo, B Chapman - International Workshop on OpenMP, 2024 - Springer
MPI+X has been the de facto standard for distributed memory parallel programming. It is
widely used primarily as an explicit two-sided communication model, which often leads to …

Evaluation of Programming Models and Performance for Stencil Computation on GPGPUs

B Shan, M Araya-Polo - 2024 IEEE International Parallel and …, 2024 - ieeexplore.ieee.org
GPGPUs are widely used in high-performance computing. Therefore, it is crucial to
experiment and discover how to better utilize their latest generations of relevant …