Tiled QR factorization algorithms

H Bouwmeester, M Jacquelin, J Langou… - Proceedings of 2011 …, 2011 - dl.acm.org
This work revisits existing algorithms for the QR factorization of rectangular matrices
composed of p × q tiles, where p ≥ q. Within this framework, we study the critical paths and …

A hybridization methodology for high-performance linear algebra software for GPUs

E Agullo, C Augonnet, J Dongarra, H Ltaief… - GPU Computing Gems …, 2012 - Elsevier
Publisher Summary: This chapter presents a hybridization methodology for the development
of high-performance linear algebra software for graphics processing units (GPUs). The …

QR factorization on a multicore node enhanced with multiple GPU accelerators

E Agullo, C Augonnet, J Dongarra… - … Parallel & Distributed …, 2011 - ieeexplore.ieee.org
One of the major trends in the design of exascale architectures is the use of multicore nodes
enhanced with GPU accelerators. Exploiting all resources of a hybrid accelerators-based …

Federated principal component analysis for genome-wide association studies

A Hartebrodt, R Nasirigerdeh… - … Conference on Data …, 2021 - ieeexplore.ieee.org
Federated learning (FL) has emerged as a privacy-aware alternative to centralized data
analysis, especially for biomedical analyses such as genome-wide association studies …

Federated singular value decomposition for high-dimensional data

A Hartebrodt, R Röttger, DB Blumenthal - Data Mining and Knowledge …, 2024 - Springer
Federated learning (FL) is emerging as a privacy-aware alternative to classical cloud-based
machine learning. In FL, the sensitive data remains in data silos and only aggregated …

Energy footprint of advanced dense numerical linear algebra using tile algorithms on multicore architectures

J Dongarra, H Ltaief, P Luszczek… - … Conference on Cloud …, 2012 - ieeexplore.ieee.org
We propose to study the impact on the energy footprint of two advanced algorithmic
strategies in the context of high performance dense linear algebra libraries: (1) mixed …

Fine-Grained Multithreading for the Multifrontal Factorization of Sparse Matrices

A Buttari - SIAM Journal on Scientific Computing, 2013 - SIAM
The advent of multicore processors represents a disruptive event in the history of computer
science as conventional parallel programming paradigms are proving incapable of fully …

Hierarchical QR factorization algorithms for multi-core clusters

J Dongarra, M Faverge, T Herault, M Jacquelin… - Parallel Computing, 2013 - Elsevier
This paper describes a new QR factorization algorithm which is especially designed for
massively parallel platforms combining parallel distributed nodes, where a node is a multi …

Toward an evolutionary task parallel integrated MPI+X programming model

RF Barrett, DT Stark, CT Vaughan, RE Grant… - Proceedings of the …, 2015 - dl.acm.org
The Bulk Synchronous Parallel programming model is showing performance limitations at
high processor counts. We propose over-decomposition of the domain, operated on as …

Scalable tile communication-avoiding QR factorization on multicore cluster systems

F Song, H Ltaief, B Hadri… - SC'10: Proceedings of the …, 2010 - ieeexplore.ieee.org
As tile linear algebra algorithms continue achieving high performance on shared-memory
multicore architectures, it is a challenging task to make them scalable on distributed-memory …