A survey of numerical linear algebra methods utilizing mixed-precision arithmetic

A Abdelfattah, H Anzt, EG Boman… - … Journal of High …, 2021 - journals.sagepub.com

The efficient utilization of mixed-precision numerical linear algebra algorithms can offer
attractive acceleration to scientific computing applications. Especially with the hardware …

Cited by 200

Mixed precision algorithms in numerical linear algebra

NJ Higham, T Mary - Acta Numerica, 2022 - cambridge.org

Today's floating-point arithmetic landscape is broader than ever. While scientific computing
has traditionally used single precision and double precision floating-point arithmetics, half …

Cited by 130

Resource-efficient convolutional networks: A survey on model-, arithmetic-, and implementation-level techniques

JK Lee, L Mukhanov, AS Molahosseini… - ACM Computing …, 2023 - dl.acm.org

Convolutional neural networks (CNNs) are used in our daily life, including self-driving cars,
virtual assistants, social network services, healthcare services, and face recognition, among …

Cited by 35

Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance

H Ootomo, R Yokota - The International Journal of High …, 2022 - journals.sagepub.com

Tensor Core is a mixed-precision matrix–matrix multiplication unit on NVIDIA GPUs with a
theoretical peak performance of more than 300 TFlop/s on Ampere architectures. Tensor …

Cited by 43

Toward performance-portable PETSc for GPU-based exascale systems

RT Mills, MF Adams, S Balay, J Brown, A Dener… - Parallel Computing, 2021 - Elsevier

The Portable Extensible Toolkit for Scientific computation (PETSc) library delivers
scalable solvers for nonlinear time-dependent differential and algebraic equations and for …

Cited by 71

Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems

A Haidar, H Bayraktar, S Tomov… - … of the Royal …, 2020 - royalsocietypublishing.org

Double-precision floating-point arithmetic (FP64) has been the de facto standard for
engineering and scientific simulations for several decades. Problem complexity and the …

Cited by 74

DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication

Y Lu, W Liu - Proceedings of the International Conference for High …, 2023 - dl.acm.org

Sparse matrix-vector multiplication (SpMV) plays a key role in computational science and
engineering, graph processing, and machine learning applications. Much work on SpMV …

Cited by 10

Mixed precision low-rank approximations and their application to block low-rank LU factorization

P Amestoy, O Boiteau, A Buttari… - IMA Journal of …, 2023 - academic.oup.com

We introduce a novel approach to exploit mixed precision arithmetic for low-rank
approximations. Our approach is based on the observation that singular vectors associated …

Cited by 26

Sharper probabilistic backward error analysis for basic linear algebra kernels with random data

NJ Higham, T Mary - SIAM Journal on Scientific Computing, 2020 - SIAM

Standard backward error analyses for numerical linear algebra algorithms provide worst-
case bounds that can significantly overestimate the backward error. Our recent probabilistic …

Cited by 46

Numerical behavior of NVIDIA tensor cores

M Fasi, NJ Higham, M Mikaitis, S Pranesh - PeerJ Computer Science, 2021 - peerj.com

We explore the floating-point arithmetic implemented in the NVIDIA tensor cores, which are
hardware accelerators for mixed-precision matrix multiplication available on the Volta …

Cited by 46

Mixed precision block fused multiply-add: Error analysis and application to GPU tensor cores

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic
