Tile low rank Cholesky factorization for climate/weather modeling applications on manycore architectures
Covariance matrices are ubiquitous in computational science and engineering. In particular,
large covariance matrices arise from multivariate spatial data sets, for instance, in …
PLASMA: Parallel linear algebra software for multicore using OpenMP
The recent version of the Parallel Linear Algebra Software for Multicore Architectures
(PLASMA) library is based on tasks with dependencies from the OpenMP standard. The …
On runtime systems for task-based programming on heterogeneous platforms
S Thibault - 2018 - inria.hal.science
SIMULATION has become pervasive in science. Real experimentation remains an essential
step in scientific research, but simulation replaced a wide range of costly and lengthy or …
The impact of taskyield on the design of tasks communicating through MPI
The OpenMP tasking directives promise to help expose a higher degree of concurrency to
the runtime than traditional worksharing constructs, which is especially useful for irregular …
Distributed-memory lattice H-matrix factorization
We parallelize the LU factorization of a hierarchical low-rank matrix (H-matrix) on a
distributed-memory computer. This is much more difficult than the H-matrix-vector …
Portable and efficient dense linear algebra in the beginning of the exascale era
The SLATE project is implementing a distributed dense linear algebra library for highly-
scalable distributed-memory accelerator-based computer systems. The goal is to provide a …
Nonrelativistic energy levels of HD
Nonadiabatic exponential functions are employed to solve the four-body Schrödinger
equation. Nonrelativistic bound energy levels of the HD molecule are calculated to the …
A New Sparse Solver Using A Posteriori Threshold Pivoting
The factorization of sparse symmetric indefinite systems is particularly challenging since
pivoting is required to maintain stability of the factorization. Pivoting techniques generally …
Performance and energy analysis of OpenMP runtime systems with dense linear algebra algorithms
In this article, we analyze performance and energy consumption of five OpenMP runtime
systems over a non-uniform memory access (NUMA) platform. We also selected three CPU …
Stop talking to me--a communication-avoiding ADER-DG realisation
DE Charrier, T Weinzierl - arXiv preprint arXiv:1801.08682, 2018 - arxiv.org
We present a communication-and data-sensitive formulation of ADER-DG for hyperbolic
differential equation systems. Sensitive here has multiple flavours: First, the formulation …