A survey of numerical linear algebra methods utilizing mixed-precision arithmetic

A Abdelfattah, H Anzt, EG Boman… - … Journal of High …, 2021 - journals.sagepub.com
The efficient utilization of mixed-precision numerical linear algebra algorithms can offer
attractive acceleration to scientific computing applications. Especially with the hardware …

Efficient parallel implementations of sparse triangular solves for GPU architectures

R Li, C Zhang - Proceedings of the 2020 SIAM Conference on Parallel …, 2020 - SIAM
The sparse triangular matrix solve (SpTrSV) is an important computation kernel that is
demanded by a variety of numerical methods such as the Gauss-Seidel iterations. However …

A scalable geometric multigrid solver for nonsymmetric elliptic systems with application to variable-density flows

M Esmaily, L Jofre, A Mani, G Iaccarino - Journal of Computational Physics, 2018 - Elsevier
A geometric multigrid algorithm is introduced for solving nonsymmetric linear systems
resulting from the discretization of the variable density Navier–Stokes equations on …

A new class of AMG interpolation methods based on matrix-matrix multiplications

R Li, B Sjogreen, UM Yang - SIAM Journal on Scientific Computing, 2021 - SIAM
A new class of distance-two interpolation methods for algebraic multigrid (AMG) that can be
formulated in terms of sparse matrix-matrix multiplications is presented and analyzed …

Stability analysis and performance evaluation of additive mixed-precision Runge-Kutta methods

B Burnett, S Gottlieb, ZJ Grant - Communications on Applied Mathematics …, 2024 - Springer
Additive Runge-Kutta methods designed for preserving highly accurate solutions in
mixed-precision computation were previously proposed and analyzed. These specially …

A two-level GPU-accelerated incomplete LU preconditioner for general sparse linear systems

T Xu, RP Li, D Osei-Kuffuor - The International Journal of …, 2023 - journals.sagepub.com
This paper presents a parallel preconditioning approach based on incomplete LU (ILU)
factorizations in the framework of Domain Decomposition (DD) for general sparse linear …

FP16 Acceleration in Structured Multigrid Preconditioner for Real-World Applications

Y Zong, P Yu, H Huang, W Xue - … of the 53rd International Conference on …, 2024 - dl.acm.org
Half-precision hardware support is now almost ubiquitous. In contrast to its active use in AI,
half-precision is less commonly employed in scientific and engineering computing. The …

Pipelined iterative solvers with kernel fusion for graphics processing units

K Rupp, J Weinbub, A Jüngel, T Grasser - ACM Transactions on …, 2016 - dl.acm.org
We revisit the implementation of iterative solvers on discrete graphics processing units and
demonstrate the benefit of implementations using extensive kernel fusion for pipelined …

Accelerating geometric multigrid preconditioning with half-precision arithmetic on GPUs

KL Oo, A Vogel - arXiv preprint arXiv:2007.07539, 2020 - arxiv.org
With the hardware support for half-precision arithmetic on NVIDIA V100 GPUs, high-
performance computing applications can benefit from lower precision at appropriate spots to …

Tusas: A fully implicit parallel approach for coupled phase-field equations

S Ghosh, CK Newman, MM Francois - Journal of Computational Physics, 2022 - Elsevier
We develop a fully-coupled, fully-implicit approach for phase-field modeling of solidification
in metals and alloys. Predictive simulation of solidification in pure metals and metal alloys …