A survey on software methods to improve the energy efficiency of parallel computing
Energy consumption is one of the top challenges for achieving the next generation of
supercomputing. Codesign of hardware and software is critical for improving energy …
Mille-feuille: A tile-grained mixed precision single-kernel conjugate gradient solver on gpus
D Yang, Y Zhao, Y Niu, W Jia, E Shao… - … Conference for High …, 2024 - ieeexplore.ieee.org
Conjugate gradient (CG) and biconjugate gradient stabilized (BiCGSTAB) are effective
methods used for solving sparse linear systems. We in this paper propose Mille-feuille, a …
Collective Mind: Towards Practical and Collaborative Auto‐Tuning
Empirical auto‐tuning and machine learning techniques have been showing high potential
to improve execution time, power consumption, code size, reliability and other important …
Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs
Graphics processing units (GPUs) brought huge performance improvements in the scientific
and numerical fields. We present an efficient hybrid CPU/GPU approach that is portable …
Prospectus for the Next LAPACK and ScaLAPACK Libraries: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration …
The convergence of several unprecedented changes, including formidable new system
design constraints and revolutionary levels of heterogeneity, has made it clear that much of …
Solving incompressible Navier-Stokes equations on heterogeneous parallel architectures
Y Wang - 2015 - theses.hal.science
In this PhD thesis, we present our research in the domain of high performance software for
computational fluid dynamics (CFD). With the increasing demand of high-resolution …
Direct and Iterative Methods for Linear Systems
G Meurant - 2023 - gerard-meurant.fr
Solving linear systems of equations is ubiquitous in scientific computing. Therefore,
numerical algorithms for solving them are of paramount importance. There are two main …
Accelerated Gauss-Huard Algorithm on Hybrid GPU-CPU: Look-Ahead with the Delayed Algorithm Approach
In this paper, we tackle a significant bottleneck, the panel factorization step, in the Gauss-
Huard algorithm through a novel parallel computing approach. We address the open …
Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time–space decomposition
DJ Magee, KE Niemeyer - Journal of Computational Physics, 2018 - Elsevier
The expedient design of precision components in aerospace and other high-tech industries
requires simulations of physical phenomena often described by partial differential equations …
Event-triggered communication in parallel computing
Communication overhead in parallel systems can be a significant bottleneck in scaling up
parallel computation. In this paper, we propose event-triggered communication methods to …