A systematic survey of general sparse matrix-matrix multiplication
General Sparse Matrix-Matrix Multiplication (SpGEMM) has attracted much attention from
researchers in graph analyzing, scientific computing, and deep learning. Many optimization …
The Kokkos ecosystem: Comprehensive performance portability for high performance computing
C Trott, L Berger-Vergiat, D Poliakoff… - … in Science & …, 2021 - ieeexplore.ieee.org
State-of-the-art engineering and science codes have grown in complexity dramatically over
the last two decades. Application teams have adopted more sophisticated development …
Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes
WF Godoy, P Valero-Lara, TE Dettling… - 2023 IEEE …, 2023 - ieeexplore.ieee.org
We explore the performance and portability of the high-level programming models: the
LLVM-based Julia and Python/Numba, and Kokkos on high-performance computing (HPC) …
Trust: Triangle Counting Reloaded on GPUs
Triangle counting is a building block for a wide range of graph applications. Traditional
wisdom suggests that i) hashing is not suitable for triangle counting, ii) edge-centric triangle …
A performance portability framework for Python
Kokkos is a programming model for writing performance portable applications for all major
high performance computing platforms. It provides abstractions for data management and …
KokkACC: Enhancing Kokkos with OpenACC
P Valero-Lara, S Lee… - 2022 Workshop on …, 2022 - ieeexplore.ieee.org
Template metaprogramming is gaining popularity as a high-level solution for achieving
performance portability on heterogeneous computing resources. Kokkos is a representative …
On A Simplified Approach to Achieve Parallel Performance and Portability Across CPU and GPU Architectures
This paper presents software advances to easily exploit computer architectures consisting of
a multi-core CPU and CPU+GPU to accelerate diverse types of high-performance …
Towards EXtreme scale technologies and accelerators for euROhpc hw/Sw supercomputing applications for exascale: The TEXTAROSSA approach
In the near future, Exascale systems will need to bridge three technology gaps to achieve
high performance while remaining under tight power constraints: energy efficiency and …
FunMC^2: A Filter for Uncertainty Visualization of Marching Cubes on Multi-Core Devices
Visualization is an important tool for scientists to extract understanding from complex
scientific data. Scientists need to understand the uncertainty inherent in all scientific data in …
Performance portable batched sparse linear solvers
K Liegeois, S Rajamanickam… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Solving large number of small linear systems is increasingly becoming a bottleneck in
computational science applications. While dense linear solvers for such systems have been …