[BOOK][B] Structured parallel programming: patterns for efficient computation
M McCool, J Reinders, A Robison - 2012 - books.google.com
Structured Parallel Programming offers the simplest way for developers to learn patterns for
high-performance parallel programming. Written by parallel computing experts and industry …
Towards dense linear algebra for hybrid GPU accelerated manycore systems
We highlight the trends leading to the increased appeal of using hybrid multicore+GPU
systems for high performance computing. We present a set of techniques that can be used to …
Communication-optimal parallel and sequential QR and LU factorizations
We present parallel and sequential dense QR factorization algorithms that are both optimal
(up to polylogarithmic factors) in the amount of communication they perform and just as …
Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms
Extra memory allows parallel matrix multiplication to be done with asymptotically less
communication than Cannon's algorithm and be faster in practice. “3D” algorithms arrange …
[BOOK][B] Communication-avoiding Krylov subspace methods
M Hoemmen - 2010 - search.proquest.com
Krylov subspace methods (KSMs) are iterative algorithms for solving large, sparse linear
systems and eigenvalue problems. Current KSMs rely on sparse matrix-vector multiply …
Gaussian elimination
NJ Higham - Wiley Interdisciplinary Reviews: Computational …, 2011 - Wiley Online Library
As the standard method for solving systems of linear equations, Gaussian elimination (GE) is
one of the most important and ubiquitous numerical algorithms. However, its successful use …
A survey of recent developments in parallel implementations of Gaussian elimination
Gaussian elimination is a canonical linear algebra procedure for solving linear systems of
equations. In the last few years, the algorithm has received a lot of attention in an attempt to …
Graph expansion and communication costs of fast matrix multiplication
The communication cost of algorithms (also known as I/O-complexity) is shown to be closely
related to the expansion properties of the corresponding computation graphs. We …
Communication avoiding rank revealing QR factorization with column pivoting
In this paper we introduce CARRQR, a communication avoiding rank revealing QR
factorization with tournament pivoting. We show that CARRQR reveals the numerical rank of …
Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous
multicore and multi-GPU systems to support dense matrix computations efficiently. The main …