[BOOK][B] ScaLAPACK users' guide

LS Blackford, J Choi, A Cleary, E D'Azevedo, J Demmel… - 1997 - SIAM
Following the initial release of LAPACK and the emerging importance of distributed memory
computing, work began on adapting LAPACK to distributed-memory architectures. Since …

The LINPACK benchmark: past, present and future

JJ Dongarra, P Luszczek… - … and Computation: practice …, 2003 - Wiley Online Library
This paper describes the LINPACK Benchmark and some of its variations commonly used to
assess the performance of computer systems. Aside from the LINPACK Benchmark suite, the …

Recent progress and prospects for integer factorisation algorithms

RP Brent - International Computing and Combinatorics …, 2000 - Springer
The integer factorisation and discrete logarithm problems are of practical importance
because of the widespread use of public key cryptosystems whose security depends on the …

A proposal for a set of parallel basic linear algebra subprograms

J Choi, J Dongarra, S Ostrouchov, A Petitet… - … Computations in Physics …, 1996 - Springer
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms
(PBLAS) for distributed memory MIMD computers. The PBLAS are targeted at distributed …

Algorithmic redistribution methods for block-cyclic decompositions

AP Petitet, JJ Dongarra - IEEE Transactions on Parallel and …, 1999 - ieeexplore.ieee.org
This article presents various data redistribution methods for block-partitioned linear algebra
algorithms operating on dense matrices that are distributed in a block-cyclic fashion …

[PDF][PDF] The LINPACK benchmark for the Fujitsu AP 1000

RP Brent - Proceedings of the Fourth Symposium on the Frontiers …, 1992 - Citeseer
We describe an implementation of the LINPACK Benchmark on the Fujitsu AP 1000. Design
considerations include communication primitives, data distribution, use of blocking to reduce …

[PDF][PDF] Efficient determination of block size NB for parallel LINPACK test

Z Wenli, F Jian**, C Mingyu - … and Systems (PDCS 2004), MIT, Received, 2004 - Citeseer
HPL is a Linpack benchmark package widely used in massive cluster system
performance test. Based on in-depth analysis of the blocked parallel solution algorithm of …

[PDF][PDF] LAPACK Working Note 100: A proposal for a set of parallel basic linear algebra subprograms

J Choi, J Dongarra, S Ostrouchov, A Petitet… - University of Tennessee …, 1995 - irisa.fr
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms
(PBLAS). The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix …

Matrix factorization using distributed panels on the Fujitsu AP1000

P Strazdins - … 1st International Conference on Algorithms and …, 1995 - ieeexplore.ieee.org
Dense linear algebra computations such as matrix factorization require the technique
of 'block-partitioned algorithms' for their efficient implementation on memory-hierarchy …

[PDF][PDF] A high performance, portable distributed blas implementation

PE Strazdins - Fifth Parallel Computing Workshop for the Fujitsu …, 1996 - Citeseer
In this paper, we give a report on recent developments for the Distributed BLAS (DBLAS)
project. These include a powerful distributed matrix representation which yields a simple …