The LINPACK benchmark: past, present and future

JJ Dongarra, P Luszczek… - … and Computation: practice …, 2003 - Wiley Online Library
This paper describes the LINPACK Benchmark and some of its variations commonly used to
assess the performance of computer systems. Aside from the LINPACK Benchmark suite, the …

[BOOK] Patterns for parallel programming

TG Mattson, B Sanders, B Massingill - 2004 - books.google.com
The Parallel Programming Guide for Every Software Developer. From grids and clusters to
next-generation game consoles, parallel computing is going mainstream. Innovations such …

Matrix algebra

JE Gentle - Springer Texts in Statistics, Springer, New York, NY, 2007 - Springer
Vectors and matrices are useful in representing multivariate numeric data, and they occur
naturally in working with linear equations or when expressing linear relationships among …

BLIS: A framework for rapidly instantiating BLAS functionality

FG Van Zee, RA Van De Geijn - ACM Transactions on Mathematical …, 2015 - dl.acm.org
The BLAS-like Library Instantiation Software (BLIS) framework is a new infrastructure for
rapidly instantiating Basic Linear Algebra Subprograms (BLAS) functionality. Its fundamental …

Elemental: A new framework for distributed memory dense matrix computations

J Poulson, B Marker, RA Van de Geijn… - ACM Transactions on …, 2013 - dl.acm.org
Parallelizing dense matrix computations to distributed memory architectures is a well-
studied subject and generally considered to be among the best understood domains of …

Algorithm 915, SuiteSparseQR: Multifrontal multithreaded rank-revealing sparse QR factorization

TA Davis - ACM Transactions on Mathematical Software (TOMS), 2011 - dl.acm.org
SuiteSparseQR is a sparse QR factorization package based on the multifrontal method.
Within each frontal matrix, LAPACK and the multithreaded BLAS enable the method to …

[BOOK] Parallel scientific computation: a structured approach using BSP and MPI

RH Bisseling - 2004 - books.google.com
This is the first text explaining how to use the bulk synchronous parallel (BSP) model and the
freely available BSPlib communication library in parallel algorithm design and parallel …

FLAME: Formal linear algebra methods environment

JA Gunnels, FG Gustavson, GM Henry… - ACM Transactions on …, 2001 - dl.acm.org
Since the advent of high-performance distributed-memory parallel computing, the need for
intelligible code has become ever greater. The development and maintenance of libraries …

Parallel solution of partial symmetric eigenvalue problems from electronic structure calculations

T Auckenthaler, V Blum, HJ Bungartz, T Huckle… - Parallel Computing, 2011 - Elsevier
The computation of selected eigenvalues and eigenvectors of a symmetric (Hermitian)
matrix is an important subtask in many contexts, for example in electronic structure …

SLICOT—A subroutine library in systems and control theory

P Benner, V Mehrmann, V Sima, S Van Huffel… - … Control, Signals, and …, 1999 - Springer
This chapter describes the subroutine library SLICOT that provides Fortran 77
implementations of numerical algorithms for computations in systems and control theory …