A survey on software methods to improve the energy efficiency of parallel computing

C Jin, BR de Supinski, D Abramson… - … Journal of High …, 2017 - journals.sagepub.com
Energy consumption is one of the top challenges for achieving the next generation of
supercomputing. Codesign of hardware and software is critical for improving energy …

Mille-feuille: A tile-grained mixed precision single-kernel conjugate gradient solver on GPUs

D Yang, Y Zhao, Y Niu, W Jia, E Shao… - … Conference for High …, 2024 - ieeexplore.ieee.org
Conjugate gradient (CG) and biconjugate gradient stabilized (BiCGSTAB) are effective
methods used for solving sparse linear systems. In this paper, we propose Mille-feuille, a …

Collective Mind: Towards Practical and Collaborative Auto‐Tuning

G Fursin, R Miceli, A Lokhmotov, M Gerndt… - Scientific …, 2014 - Wiley Online Library
Empirical auto‐tuning and machine learning techniques have been showing high potential
to improve execution time, power consumption, code size, reliability and other important …

Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs

S Donfack, S Tomov, J Dongarra - 2014 IEEE International …, 2014 - ieeexplore.ieee.org
Graphics processing units (GPUs) brought huge performance improvements in the scientific
and numerical fields. We present an efficient hybrid CPU/GPU approach that is portable …

Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration …

J Demmel, J Dongarra, J Langou, J Langou… - 2020 - stat.berkeley.edu
The convergence of several unprecedented changes, including formidable new system
design constraints and revolutionary levels of heterogeneity, has made it clear that much of …

Solving incompressible Navier-Stokes equations on heterogeneous parallel architectures

Y Wang - 2015 - theses.hal.science
In this PhD thesis, we present our research in the domain of high performance software for
computational fluid dynamics (CFD). With the increasing demand of high-resolution …

Direct and Iterative Methods for Linear Systems

G Meurant - 2023 - gerard-meurant.fr
Solving linear systems of equations is ubiquitous in scientific computing. Therefore,
numerical algorithms for solving them are of paramount importance. There are two main …

Accelerated Gauss-Huard Algorithm on Hybrid GPU-CPU: Look-Ahead with the Delayed Algorithm Approach

HG Elzayyadi, WS Sayed, MAEL Naggar… - 2023 International …, 2023 - ieeexplore.ieee.org
In this paper, we tackle a significant bottleneck, the panel factorization step, in the
Gauss-Huard algorithm through a novel parallel computing approach. We address the open …

Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time–space decomposition

DJ Magee, KE Niemeyer - Journal of Computational Physics, 2018 - Elsevier
The expedient design of precision components in aerospace and other high-tech industries
requires simulations of physical phenomena often described by partial differential equations …

Event-triggered communication in parallel computing

S Ghosh, KK Saha, V Gupta… - 2018 IEEE/ACM 9th …, 2018 - ieeexplore.ieee.org
Communication overhead in parallel systems can be a significant bottleneck in scaling up
parallel computation. In this paper, we propose event-triggered communication methods to …