A large-scale study of MPI usage in open-source HPC applications

I Laguna, R Marshall, K Mohror, M Ruefenacht… - Proceedings of the …, 2019 - dl.acm.org
Understanding the state-of-the-practice in MPI usage is paramount for many aspects of
supercomputing, including optimizing the communication of HPC applications and informing …

DAG-based workflows scheduling using actor–critic deep reinforcement learning

GP Koslovski, K Pereira, PR Albuquerque - Future Generation Computer …, 2024 - Elsevier
High-Performance Computing (HPC) is essential to support the advance in multiple
research and industrial fields. Despite the recent growth in processing and networking …

CXL memory as persistent memory for disaggregated HPC: A practical approach

Y Fridman, S Mutalik Desai, N Singh… - Proceedings of the SC' …, 2023 - dl.acm.org
In the landscape of High-Performance Computing (HPC), the quest for efficient and scalable
memory solutions remains paramount. The advent of Compute Express Link (CXL) …

Finepoints: Partitioned multithreaded MPI communication

RE Grant, MGF Dosanjh, MJ Levenhagen… - … Conference, ISC High …, 2019 - Springer
The MPI multithreading model has been historically difficult to optimize; the interface that it
provides for threads was designed as a process-level interface. This model has led to …

A performance analysis of modern parallel programming models using a compute-bound application

A Poenaru, WC Lin, S McIntosh-Smith - International Conference on High …, 2021 - Springer
Performance portability is becoming more-and-more important as next-generation high
performance computing systems grow increasingly diverse and heterogeneous. Several …

A survey on malleability solutions for high-performance distributed computing

JI Aliaga, M Castillo, S Iserte, I Martín-Álvarez… - Applied Sciences, 2022 - mdpi.com
Maintaining a high rate of productivity, in terms of completed jobs per unit of time, in High-
Performance Computing (HPC) facilities is a cornerstone in the next generation of exascale …

Implementation and evaluation of MPI 4.0 partitioned communication libraries

MGF Dosanjh, A Worley, D Schafer, P Soundararajan… - Parallel Computing, 2021 - Elsevier
Partitioned point-to-point communication primitives provide a performance-oriented
mechanism to support a hybrid parallel programming model and have been included in the …

Understanding the use of message passing interface in exascale proxy applications

N Sultana, M Rüfenacht, A Skjellum… - Concurrency and …, 2021 - Wiley Online Library
The Exascale Computing Project (ECP) focuses on the development of future
exascale‐capable applications. Most ECP applications use the message passing interface …

Reconfigurable switches for high performance and flexible MPI collectives

P Haghi, A Guo, Q Xiong, C Yang… - Concurrency and …, 2022 - Wiley Online Library
There has been much effort in offloading MPI collective operations into hardware. But while
NIC‐based collective acceleration is well‐studied, offloading their processing into the …

FPGAs in the network and novel communicator support accelerate MPI collectives

P Haghi, A Guo, Q Xiong, R Patel… - 2020 IEEE High …, 2020 - ieeexplore.ieee.org
MPI collective operations can often be performance killers in HPC applications; we seek to
solve this bottleneck by offloading them to reconfigurable hardware within the switch itself …