Preconditioners for Krylov subspace methods: An overview

JW Pearson, J Pestana - GAMM‐Mitteilungen, 2020 - Wiley Online Library
When simulating a mechanism from science or engineering, or an industrial process, one is
frequently required to construct a mathematical model, and then resolve this model …

Communication lower bounds and optimal algorithms for numerical linear algebra

G Ballard, E Carson, J Demmel, M Hoemmen… - Acta Numerica, 2014 - cambridge.org
The traditional metric for the efficiency of a numerical algorithm has been the number of
arithmetic operations it performs. Technological trends have long been reducing the time to …

Krylov subspace methods: principles and analysis

J Liesen, Z Strakos - 2013 - books.google.com
The mathematical theory of Krylov subspace methods with a focus on solving systems of
linear algebraic equations is given a detailed treatment in this principles-based book …

Communication-optimal parallel and sequential QR and LU factorizations

J Demmel, L Grigori, M Hoemmen, J Langou - SIAM Journal on Scientific …, 2012 - SIAM
We present parallel and sequential dense QR factorization algorithms that are both optimal
(up to polylogarithmic factors) in the amount of communication they perform and just as …

Fast stencil-code computation on a wafer-scale processor

K Rocki, D Van Essendelft, I Sharapov… - … Conference for High …, 2020 - ieeexplore.ieee.org
The performance of CPU-based and GPU-based systems is often low for PDE codes, where
large, sparse, and often structured systems of linear equations must be solved. Iterative …

A heuristic clustering-based task deployment approach for load balancing using Bayes theorem in cloud environment

J Zhao, K Yang, X Wei, Y Ding, L Hu… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
Aiming at the current problems that most physical hosts in the cloud data center are so
overloaded that it makes the whole cloud data center's load imbalanced and that existing load …

AmgX: A library for GPU accelerated algebraic multigrid and preconditioned iterative methods

M Naumov, M Arsaev, P Castonguay, J Cohen… - SIAM Journal on …, 2015 - SIAM
The solution of large sparse linear systems arises in many applications, such as
computational fluid dynamics and oil reservoir simulation. In realistic cases the matrices are …

Hiding global synchronization latency in the preconditioned conjugate gradient algorithm

P Ghysels, W Vanroose - Parallel Computing, 2014 - Elsevier
Scalability of Krylov subspace methods suffers from costly global synchronization steps that
arise in dot-products and norm calculations on parallel machines. In this work, a modified …

Dark memory and accelerator-rich system optimization in the dark silicon era

A Pedram, S Richardson, M Horowitz… - IEEE Design & …, 2016 - ieeexplore.ieee.org
Unlike traditional dark silicon works that attack the computing logic, this article puts a focus
on the memory part, which dissipates most of the energy for memory-bound CPU …

Block Gram-Schmidt algorithms and their stability properties

E Carson, K Lund, M Rozložník, S Thomas - Linear Algebra and its …, 2022 - Elsevier
Block Gram-Schmidt algorithms serve as essential kernels in many scientific
computing applications, but for many commonly used variants, a rigorous treatment of their …