- Academic Search

Mixed precision algorithms in numerical linear algebra

NJ Higham, T Mary - Acta Numerica, 2022 - cambridge.org

Today's floating-point arithmetic landscape is broader than ever. While scientific computing
has traditionally used single precision and double precision floating-point arithmetics, half …

Cited by 130

[PDF] arxiv.org

Efficient mixed-precision matrix factorization of the inverse overlap matrix in electronic structure calculations with AI-hardware and GPUs

A Habib, J Finkelstein… - Journal of Chemical Theory …, 2024 - ACS Publications

In recent years, a new kind of accelerated hardware has gained popularity in the artificial
intelligence (AI) community which enables extremely high-performance tensor contractions …

Cited by 2

[PDF] acm.org

CPFloat: A C library for simulating low-precision arithmetic

M Fasi, M Mikaitis - ACM Transactions on Mathematical Software, 2023 - dl.acm.org

One can simulate low-precision floating-point arithmetic via software by executing each
arithmetic operation in hardware and then rounding the result to the desired number of …

Cited by 14

[PDF] hal.science

Accurate calculation of Euclidean Norms using Double-word arithmetic

V Lefèvre, N Louvet, JM Muller, J Picot… - ACM Transactions on …, 2023 - dl.acm.org

We consider the computation of the Euclidean (or L2) norm of an n-dimensional vector in
floating-point arithmetic. We review the classical solutions used to avoid spurious overflow …

Cited by 7

[PDF] hal.science

Mixed precision randomized low-rank approximation with GPU tensor cores

M Baboulin, S Donfack, O Kaya, T Mary… - European Conference on …, 2024 - Springer

Randomized projection methods have been shown to be very efficient at computing
low-rank approximations (LRA) of large matrices. In this work, we investigate the design and …

Cited by 2

LE-GEMM: A lightweight emulation-based GEMM with precision refinement on GPU

Y Zhang, L Lu, Z Yang, Z Liang, S Suo - Journal of Systems Architecture, 2025 - Elsevier

Many special hardware units, such as Matrix Core and Tensor Core, have recently been
designed and applied in various scientific computing scenarios. These units support tensor …

[PDF] arxiv.org

Susceptibility formulation of density matrix perturbation theory

A Niklasson, A Habib, JD Finkelstein… - The Journal of …, 2024 - pubs.aip.org

Density matrix perturbation theory based on recursive Fermi-operator expansions provides a
computationally efficient framework for time-independent response calculations in quantum …

[PDF] arxiv.org

Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration

M Croci, GN Wells - arXiv preprint arXiv:2410.12614, 2024 - arxiv.org

In this paper we develop the first fine-grained rounding error analysis of finite element (FE)
cell kernels and assembly. The theory includes mixed-precision implementations and …

[PDF] arxiv.org

Mixed-precision numerics in scientific applications: survey and perspectives

A Kashi, H Lu, W Brewer, D Rogers… - arXiv preprint arXiv …, 2024 - arxiv.org

The explosive demand for artificial intelligence (AI) workloads has led to a significant
increase in silicon area dedicated to lower-precision computations on recent high …

[PDF] ieee.org

Monotonicity of Multi-Term Floating-Point Adders

M Mikaitis - IEEE Transactions on Computers, 2024 - ieeexplore.ieee.org

In the literature on algorithms for computing multi-term addition in floating-point arithmetic it
is often shown that a hardware unit that has single normalization and rounding improves …

Cited by 3

Matrix multiplication in multiword arithmetic: Error analysis and application to GPU tensor cores

Mixed precision algorithms in numerical linear algebra