A survey of MPI usage in the US exascale computing project

DE Bernholdt, S Boehm, G Bosilca… - Concurrency and …, 2020 - Wiley Online Library
The Exascale Computing Project (ECP) is currently the primary effort in the United
States focused on developing “exascale” levels of computing capabilities, including …

Finepoints: Partitioned multithreaded MPI communication

RE Grant, MGF Dosanjh, MJ Levenhagen… - … Conference, ISC High …, 2019 - Springer
The MPI multithreading model has been historically difficult to optimize; the interface that it
provides for threads was designed as a process-level interface. This model has led to …

MPIX Stream: An explicit solution to hybrid MPI+X programming

H Zhou, K Raffenetti, Y Guo, R Thakur - … of the 29th European MPI Users' …, 2022 - dl.acm.org
The hybrid MPI+X programming paradigm, where X refers to threads or GPUs, has gained
prominence in the high-performance computing arena. This corresponds to a trend of …

Implementation and evaluation of MPI 4.0 partitioned communication libraries

MGF Dosanjh, A Worley, D Schafer, P Soundararajan… - Parallel Computing, 2021 - Elsevier
Partitioned point-to-point communication primitives provide a performance-oriented
mechanism to support a hybrid parallel programming model and have been included in the …

Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints

S Sridharan, J Dinan… - SC'14: Proceedings of the …, 2014 - ieeexplore.ieee.org
Modern high-speed interconnection networks are designed with capabilities to support
communication from multiple processor cores. The MPI endpoints extension has been …

Give MPI threading a fair chance: A study of multithreaded MPI designs

T Patinyasakdikul, D Eberius… - … Conference on Cluster …, 2019 - ieeexplore.ieee.org
The Message Passing Interface (MPI) has been one of the most prominent programming
paradigms in high-performance computing (HPC) for the past decade. Lately, with changes …

Exascale machines require new programming paradigms and runtimes

G Da Costa, T Fahringer, JAR Gallego… - Supercomputing …, 2015 - superfri.org
Extreme scale parallel computing systems will have tens of thousands of optionally
accelerator-equipped nodes with hundreds of cores each, as well as deep memory …

Improving MPI multi-threaded RMA communication performance

N Hjelm, MGF Dosanjh, RE Grant, T Groves… - Proceedings of the 47th …, 2018 - dl.acm.org
One-sided communication is crucial to enabling communication concurrency. As core counts
have increased, particularly with many-core architectures, one-sided (RMA) communication …

How I learned to stop worrying about user-visible endpoints and love MPI

R Zambre, A Chandramowlishwaran… - Proceedings of the 34th …, 2020 - dl.acm.org
MPI+threads is gaining prominence as an alternative to the traditional "MPI everywhere"
model in order to better handle the disproportionate increase in the number of cores …

Partitioned collective communication

DJ Holmes, A Skjellum, J Jaeger… - 2021 Workshop on …, 2021 - ieeexplore.ieee.org
Partitioned point-to-point communication and persistent collective communication were both
recently standardized in MPI-4.0. Each offers performance and scalability advantages over …