Measuring multithreaded message matching misery

W Schonbein, MGF Dosanjh, RE Grant… - Euro-Par 2018: Parallel …, 2018 - Springer
MPI usage patterns are changing as applications move towards fully-multithreaded
runtimes. However, the impact of these patterns on MPI message matching is not well …

Improving MPI multi-threaded RMA communication performance

N Hjelm, MGF Dosanjh, RE Grant, T Groves… - Proceedings of the 47th …, 2018 - dl.acm.org
One-sided communication is crucial to enabling communication concurrency. As core counts
have increased, particularly with many-core architectures, one-sided (RMA) communication …

Towards millions of communicating threads

HV Dang, M Snir, W Gropp - Proceedings of the 23rd European MPI …, 2016 - dl.acm.org
Proceedings of the 23rd European MPI Users' Group Meeting: Towards millions of
communicating threads. Hoang-Vu …

Callback-based completion notification using MPI Continuations

J Schuchart, P Samfass, C Niethammer, J Gracia… - Parallel Computing, 2021 - Elsevier
Asynchronous programming models (APM) are gaining more and more traction, allowing
applications to expose the available concurrency to a runtime system tasked with …

Optimizing computation-communication overlap in asynchronous task-based programs

E Castillo, N Jain, M Casas, M Moreto… - Proceedings of the …, 2019 - dl.acm.org
Asynchronous task-based programming models are gaining popularity to address the
programmability and performance challenges in high performance computing. One of the …

CMB: a configurable messaging benchmark to explore fine-grained communication

WP Marts, DA Kruse, MGF Dosanjh… - 2024 IEEE 24th …, 2024 - ieeexplore.ieee.org
Modern communication APIs provide increased ability to specify when, where, and how to
send data between processes. One recent innovation is fine-grained communication, where …

From reactive to proactive load balancing for task‐based parallel applications in distributed memory machines

M Thanh Chung, J Weidendorfer… - Concurrency and …, 2023 - Wiley Online Library
Load balancing is often a challenge in task‐parallel applications. The balancing problems
are divided into static and dynamic. “Static” means that we have some prior knowledge about …

Fuzzy matching: Hardware accelerated MPI communication middleware

MGF Dosanjh, W Schonbein, RE Grant… - 2019 19th IEEE/ACM …, 2019 - ieeexplore.ieee.org
Contemporary parallel scientific codes often rely on message passing for inter-process
communication. However, inefficient coding practices or multithreading (e.g., via …

Tail queues: a multi‐threaded matching architecture

MGF Dosanjh, RE Grant, W Schonbein… - Concurrency and …, 2020 - Wiley Online Library
As we approach exascale, computational parallelism will have to drastically increase in
order to meet throughput targets. Many‐core architectures have exacerbated this problem by …

Enabling tractable exploration of the performance of adaptive mesh refinement

CT Vaughan, RF Barrett - 2015 IEEE International Conference …, 2015 - ieeexplore.ieee.org
A broad range of physical phenomena in science and engineering can be explored using
finite difference and volume based application codes. Incorporating Adaptive Mesh …