Tile low rank Cholesky factorization for climate/weather modeling applications on manycore architectures

K Akbudak, H Ltaief, A Mikhalev, D Keyes - International Conference on …, 2017 - Springer
Covariance matrices are ubiquitous in computational science and engineering. In particular,
large covariance matrices arise from multivariate spatial data sets, for instance, in …

PLASMA: Parallel linear algebra software for multicore using OpenMP

J Dongarra, M Gates, A Haidar, J Kurzak… - ACM Transactions on …, 2019 - dl.acm.org
The recent version of the Parallel Linear Algebra Software for Multicore Architectures
(PLASMA) library is based on tasks with dependencies from the OpenMP standard. The …

On runtime systems for task-based programming on heterogeneous platforms

S Thibault - 2018 - inria.hal.science
Simulation has become pervasive in science. Real experimentation remains an essential
step in scientific research, but simulation replaced a wide range of costly and lengthy or …

The impact of taskyield on the design of tasks communicating through MPI

J Schuchart, K Tsugane, J Gracia, M Sato - Evolving OpenMP for Evolving …, 2018 - Springer
The OpenMP tasking directives promise to help expose a higher degree of concurrency to
the runtime than traditional worksharing constructs, which is especially useful for irregular …

Distributed-memory lattice H-matrix factorization

I Yamazaki, A Ida, R Yokota… - … International Journal of …, 2019 - journals.sagepub.com
We parallelize the LU factorization of a hierarchical low-rank matrix (H-matrix) on a
distributed-memory computer. This is much more difficult than the H-matrix-vector …

Portable and efficient dense linear algebra in the beginning of the exascale era

M Gates, A YarKhan, D Sukkari… - 2022 IEEE/ACM …, 2022 - ieeexplore.ieee.org
The SLATE project is implementing a distributed dense linear algebra library for highly-
scalable distributed-memory accelerator-based computer systems. The goal is to provide a …

Nonrelativistic energy levels of HD

K Pachucki, J Komasa - Physical Chemistry Chemical Physics, 2018 - pubs.rsc.org
Nonadiabatic exponential functions are employed to solve the four-body Schrödinger
equation. Nonrelativistic bound energy levels of the HD molecule are calculated to the …

A New Sparse Solver Using A Posteriori Threshold Pivoting

I Duff, J Hogg, F Lopez - SIAM Journal on Scientific Computing, 2020 - SIAM
The factorization of sparse symmetric indefinite systems is particularly challenging since
pivoting is required to maintain stability of the factorization. Pivoting techniques generally …

Performance and energy analysis of OpenMP runtime systems with dense linear algebra algorithms

JV Ferreira Lima, I Raïs, L Lefevre… - … International Journal of …, 2019 - journals.sagepub.com
In this article, we analyze performance and energy consumption of five OpenMP runtime
systems over a non-uniform memory access (NUMA) platform. We also selected three CPU …

Stop talking to me--a communication-avoiding ADER-DG realisation

DE Charrier, T Weinzierl - arXiv preprint arXiv:1801.08682, 2018 - arxiv.org
We present a communication- and data-sensitive formulation of ADER-DG for hyperbolic
differential equation systems. Sensitive here has multiple flavours: First, the formulation …