Mixed precision algorithms in numerical linear algebra
Today's floating-point arithmetic landscape is broader than ever. While scientific computing
has traditionally used single precision and double precision floating-point arithmetics, half …
Efficient mixed-precision matrix factorization of the inverse overlap matrix in electronic structure calculations with AI-hardware and GPUs
In recent years, a new kind of accelerated hardware has gained popularity in the artificial
intelligence (AI) community which enables extremely high-performance tensor contractions …
CPFloat: A C library for simulating low-precision arithmetic
One can simulate low-precision floating-point arithmetic via software by executing each
arithmetic operation in hardware and then rounding the result to the desired number of …
Accurate calculation of Euclidean Norms using Double-word arithmetic
We consider the computation of the Euclidean (or L2) norm of an n-dimensional vector in
floating-point arithmetic. We review the classical solutions used to avoid spurious overflow …
Mixed precision randomized low-rank approximation with GPU tensor cores
Randomized projection methods have been shown to be very efficient at computing low-
rank approximations (LRA) of large matrices. In this work, we investigate the design and …
LE-GEMM: A lightweight emulation-based GEMM with precision refinement on GPU
Y Zhang, L Lu, Z Yang, Z Liang, S Suo - Journal of Systems Architecture, 2025 - Elsevier
Many special hardware units, such as Matrix Core and Tensor Core, have recently been
designed and applied in various scientific computing scenarios. These units support tensor …
Susceptibility formulation of density matrix perturbation theory
Density matrix perturbation theory based on recursive Fermi-operator expansions provides a
computationally efficient framework for time-independent response calculations in quantum …
Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration
In this paper we develop the first fine-grained rounding error analysis of finite element (FE)
cell kernels and assembly. The theory includes mixed-precision implementations and …
Mixed-precision numerics in scientific applications: survey and perspectives
The explosive demand for artificial intelligence (AI) workloads has led to a significant
increase in silicon area dedicated to lower-precision computations on recent high …
Monotonicity of Multi-Term Floating-Point Adders
M Mikaitis - IEEE Transactions on Computers, 2024 - ieeexplore.ieee.org
In the literature on algorithms for computing multi-term addition in floating-point arithmetic it
is often shown that a hardware unit that has single normalization and rounding improves …