Communication lower bounds and optimal algorithms for numerical linear algebra
The traditional metric for the efficiency of a numerical algorithm has been the number of
arithmetic operations it performs. Technological trends have long been reducing the time to …
The sparse polyhedral framework: Composing compiler-generated inspector-executor code
Irregular applications such as big graph analysis, material simulations, molecular dynamics
simulations, and finite element analysis have performance problems due to their use of …
Communication-optimal parallel and sequential QR and LU factorizations
We present parallel and sequential dense QR factorization algorithms that are both optimal
(up to polylogarithmic factors) in the amount of communication they perform and just as …
Reducing communication in graph neural network training
Graph Neural Networks (GNNs) are powerful and flexible neural networks that use the
naturally sparse connectivity information of the data. GNNs represent this connectivity as …
[BOOK][B] Communication-avoiding Krylov subspace methods
M Hoemmen - 2010 - search.proquest.com
Krylov subspace methods (KSMs) are iterative algorithms for solving large, sparse linear
systems and eigenvalue problems. Current KSMs rely on sparse matrix-vector multiply …
Tiled QR factorization algorithms
This work revisits existing algorithms for the QR factorization of rectangular matrices
composed of p × q tiles, where p ≥ q. Within this framework, we study the critical paths and …
[BOOK][B] Communication-avoiding Krylov subspace methods in theory and practice
EC Carson - 2015 - search.proquest.com
Advancements in the field of high-performance scientific computing are necessary to
address the most important challenges we face in the 21st century. From physical modeling …
Communication-avoiding QR decomposition for GPUs
We describe an implementation of the Communication-Avoiding QR (CAQR) factorization
that runs entirely on a single graphics processor (GPU). We show that the reduction in …
Parallel algorithms for tensor train arithmetic
We present efficient and scalable parallel algorithms for performing mathematical operations
for low-rank tensors represented in the tensor train (TT) format. We consider algorithms for …
Shifted Cholesky QR for computing the QR factorization of ill-conditioned matrices
The Cholesky QR algorithm is an efficient communication-minimizing algorithm for
computing the QR factorization of a tall-skinny matrix X ∈ R^(m×n), where m ≫ n. Unfortunately …