A survey of numerical linear algebra methods utilizing mixed-precision arithmetic
The efficient utilization of mixed-precision numerical linear algebra algorithms can offer
attractive acceleration to scientific computing applications. Especially with the hardware …
Mixed precision algorithms in numerical linear algebra
Today's floating-point arithmetic landscape is broader than ever. While scientific computing
has traditionally used single precision and double precision floating-point arithmetics, half …
Resource-efficient convolutional networks: A survey on model-, arithmetic-, and implementation-level techniques
Convolutional neural networks (CNNs) are used in our daily life, including self-driving cars,
virtual assistants, social network services, healthcare services, and face recognition, among …
Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance
Tensor Core is a mixed-precision matrix–matrix multiplication unit on NVIDIA GPUs with a
theoretical peak performance of more than 300 TFlop/s on Ampere architectures. Tensor …
Toward performance-portable PETSc for GPU-based exascale systems
The Portable, Extensible Toolkit for Scientific Computation (PETSc) library delivers
scalable solvers for nonlinear time-dependent differential and algebraic equations and for …
Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems
Double-precision floating-point arithmetic (FP64) has been the de facto standard for
engineering and scientific simulations for several decades. Problem complexity and the …
DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication
Sparse matrix-vector multiplication (SpMV) plays a key role in computational science and
engineering, graph processing, and machine learning applications. Much work on SpMV …
Mixed precision low-rank approximations and their application to block low-rank LU factorization
We introduce a novel approach to exploit mixed precision arithmetic for low-rank
approximations. Our approach is based on the observation that singular vectors associated …
Sharper probabilistic backward error analysis for basic linear algebra kernels with random data
Standard backward error analyses for numerical linear algebra algorithms provide worst-
case bounds that can significantly overestimate the backward error. Our recent probabilistic …
Numerical behavior of NVIDIA tensor cores
We explore the floating-point arithmetic implemented in the NVIDIA tensor cores, which are
hardware accelerators for mixed-precision matrix multiplication available on the Volta …