A multi-architecture approach for implicit computational fluid dynamics on unstructured grids

G Nastac, A Walden, L Wang, EJ Nielsen… - AIAA SCITECH 2023 …, 2023 - arc.aiaa.org
View Video Presentation: https://doi.org/10.2514/6.2023-1226.vid High-performance
computing (HPC) architectures are trending toward manycore paradigms such as graphics …

Exploring Numba and CuPy for GPU-Accelerated Monte Carlo Radiation Transport

T Askar, A Yergaliyev, B Shukirgaliyev, E Abdikamalov - Computation, 2024 - mdpi.com
This paper examines the performance of two popular GPU programming platforms, Numba
and CuPy, for Monte Carlo radiation transport calculations. We conducted tests involving …

Impacts of floating-point non-associativity on reproducibility for HPC and deep learning applications

S Shanmugavelu, M Taillefumier… - SC24-W: Workshops …, 2024 - ieeexplore.ieee.org
Run to run variability in parallel programs caused by floating-point non-associativity has
been known to significantly affect reproducibility in iterative algorithms, due to accumulating …

Static Generation of Efficient OpenMP Offload Data Mappings

L Marzen, A Dutta, A Jannesari - … : International Conference for …, 2024 - ieeexplore.ieee.org
Increasing heterogeneity in HPC architectures and compiler advancements have led to
OpenMP being frequently used to enable computations on heterogeneous devices …

Investigating the HIP programming model with regards to portability and performance portability

N Kerscher - 2022 - events.gwdg.de
While modern HPC systems are being dominated by NVIDIA GPUs, new vendors such as
AMD and Intel are entering the field, which creates the problem of software portability across …

OpenMP's Asynchronous Offloading for All-pairs Shortest Path Graph Algorithms on GPUs

M Thavappiragasam, V Kale - 2022 IEEE/ACM International …, 2022 - ieeexplore.ieee.org
Numerical scientific computations, which are based on floating-point operations, have been
sped up greatly via GPUs or other accelerators of supercomputers. However, combinatorial …