The sparse polyhedral framework: Composing compiler-generated inspector-executor code

MM Strout, M Hall, C Olschanowsky - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org
Irregular applications such as big graph analysis, material simulations, molecular dynamics
simulations, and finite element analysis have performance problems due to their use of …

A recursive algebraic coloring technique for hardware-efficient symmetric sparse matrix-vector multiplication

C Alappat, A Basermann, AR Bishop… - ACM Transactions on …, 2020 - dl.acm.org
The symmetric sparse matrix-vector multiplication (SymmSpMV) is an important building
block for many numerical linear algebra kernel operations or graph traversal applications …

A synchronization-free algorithm for parallel sparse triangular solves

W Liu, A Li, J Hogg, IS Duff, B Vinter - … August 24-26, 2016, Proceedings 22, 2016 - Springer
The sparse triangular solve kernel, SpTRSV, is an important building block for a number of
numerical linear algebra routines. Parallelizing SpTRSV on today's manycore platforms …

Iterative sparse triangular solves for preconditioning

H Anzt, E Chow, J Dongarra - Euro-Par 2015: Parallel Processing: 21st …, 2015 - Springer
Sparse triangular solvers are typically parallelized using level-scheduling techniques, but
parallel efficiency is poor on high-throughput architectures like GPUs. We propose using an …

Enabling and scaling the HPCG benchmark on the newest generation Sunway supercomputer with 42 million heterogeneous cores

Q Zhu, H Luo, C Yang, M Ding, W Yin… - Proceedings of the …, 2021 - dl.acm.org
We study and evaluate performance optimization techniques for the HPCG benchmark on
the newest generation Sunway supercomputer. Specifically, a two-level blocking scheme is …

Incomplete sparse approximate inverses for parallel preconditioning

H Anzt, TK Huckle, J Bräckle, J Dongarra - Parallel Computing, 2018 - Elsevier
In this paper, we propose a new preconditioning method that can be seen as a
generalization of block-Jacobi methods, or as a simplification of the sparse approximate …

swSpTRSV: A fast sparse triangular solve with sparse level tile layout on Sunway architectures

X Wang, W Liu, W Xue, L Wu - Proceedings of the 23rd ACM SIGPLAN …, 2018 - dl.acm.org
Sparse triangular solve (SpTRSV) is one of the most important kernels in many real-world
applications. Currently, much research on parallel SpTRSV focuses on level-set construction …

Automating wavefront parallelization for sparse matrix computations

A Venkat, MS Mohammadi, J Park… - SC'16: Proceedings …, 2016 - ieeexplore.ieee.org
This paper presents a compiler and runtime framework for parallelizing sparse matrix
computations that have loop-carried dependences. Our approach automatically generates a …

Performance optimizations for scalable implicit RANS calculations with SU2

TD Economon, D Mudigere, G Bansal, A Heinecke… - Computers & …, 2016 - Elsevier
In this paper, we present single- and multi-node optimizations of SU2, a widely-used,
open-source Computational Fluid Dynamics application, aimed at improving performance and …

Fast synchronization-free algorithms for parallel sparse triangular solves with multiple right-hand sides

W Liu, A Li, JD Hogg, IS Duff… - … and Computation: Practice …, 2017 - Wiley Online Library
The sparse triangular solve kernels, SpTRSV and SpTRSM, are important building blocks for
a number of numerical linear algebra routines. Parallelizing SpTRSV and SpTRSM on …