ADAPT: An event-based adaptive collective communication framework

X Luo, W Wu, G Bosilca, T Patinyasakdikul… - Proceedings of the 27th …, 2018 - dl.acm.org
The increase in scale and heterogeneity of high-performance computing (HPC) systems
predispose the performance of Message Passing Interface (MPI) collective communications …

Hardware performance variation: A comparative study using lightweight kernels

H Weisbach, B Gerofi, B Kocoloski, H Härtig… - … Conference, ISC High …, 2018 - Springer
Imbalance among components of large scale parallel simulations can adversely affect
overall application performance. Software induced imbalance has been extensively studied …

Using simulation to examine the effect of MPI message matching costs on application performance

S Levy, KB Ferreira - Proceedings of the 25th European MPI Users' …, 2018 - dl.acm.org
Attaining high performance with MPI applications requires efficient message matching to
minimize message processing overheads and the latency these overheads introduce into …

MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns

MS Beni, B Cosenza, S Hunold - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
The Message Passing Interface (MPI) is a programming model for developing high-performance
applications on large-scale machines. A key component of MPI is its collective …

Transforming blocking MPI collectives to non-blocking and persistent operations

H Ahmed, A Skjellum, P Bangalore… - Proceedings of the 24th …, 2017 - dl.acm.org
This paper describes Petal, a prototype tool that uses compiler-analysis techniques to
automate code transformations to hide communication costs behind computation by …

Jitter-trace: A low-overhead OS noise tracing tool based on Linux perf

NM Gonzalez, A Morari, F Checconi - Proceedings of the 7th …, 2017 - dl.acm.org
Operating System (OS) noise is a well-known phenomenon in which OS activities interfere
with the execution of large-scale parallel applications. Due to OS noise, feature-rich software …

Progressive load balancing of asynchronous algorithms

J Zarins, M Weiland - Proceedings of the Seventh Workshop on Irregular …, 2017 - dl.acm.org
Synchronisation in the presence of noise and hardware performance variability is a key
challenge that prevents applications from scaling to large problems and machines. Using …

Communication-hiding pipelined BiCGSafe methods for solving large linear systems

VQH Huynh, H Suito - Applied Mathematics and Computation, 2023 - Elsevier
Recently, a new variant of the BiCGStab method, known as the pipelined BiCGStab, has
been proposed. This method can achieve a higher degree of scalability and speed-up rates …

The unexpected virtue of almost: Exploiting MPI collective operations to approximately coordinate checkpoints

S Levy, KB Ferreira, P Widener - … and Computation: Practice …, 2020 - Wiley Online Library
Coordinated checkpoint/restart is currently the dominant approach to mitigating the impact of
failures on important scientific applications running on large‐scale distributed systems …