Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints

S Sridharan, J Dinan… - SC'14: Proceedings of the …, 2014 - ieeexplore.ieee.org
Modern high-speed interconnection networks are designed with capabilities to support
communication from multiple processor cores. The MPI endpoints extension has been …

Multi-level load balancing with an integrated runtime approach

S Bak, H Menon, S White, M Diener… - 2018 18th IEEE/ACM …, 2018 - ieeexplore.ieee.org
The recent trend of increasing numbers of cores per chip has resulted in vast amounts of
on-node parallelism. These high core counts result in hardware variability that introduces …

Enhancing MPI+OpenMP task based applications for heterogeneous architectures with GPU Support

M Ferat, R Pereira, A Roussel, P Carribault… - … Workshop on OpenMP, 2022 - Springer
Heterogeneous supercomputers are widespread over HPC systems and programming
efficient applications on these architectures is a challenge. Task-based programming …

Introducing kernel-level page reuse for high performance computing

S Valat, M Pérache, W Jalby - … of the ACM SIGPLAN Workshop on …, 2013 - dl.acm.org
Due to computer architecture evolution, more and more HPC applications have to include
thread-based parallelism and take care of memory consumption. Such evolutions require …

Towards achieving transparent malleability thanks to MPI process virtualization

H Taboada, R Pereira, J Jaeger, JB Besnard - International Conference on …, 2023 - Springer
The field of High-Performance Computing is rapidly evolving, driven by the race for
computing power and the emergence of new architectures. Despite these changes, the …

A Distributed Version of Syrup

G Audemard, JM Lagniez, N Szczepanski… - … Conference on Theory …, 2017 - Springer
A portfolio SAT solver has to share clauses in order to be efficient. In a distributed
environment, such sharing implies additional problems: more information has to be …

Introducing task-containers as an alternative to runtime-stacking

JB Besnard, J Adam, S Shende, M Pérache… - Proceedings of the 23rd …, 2016 - dl.acm.org
The advent of many-core architectures poses new challenges to the MPI programming
model which has been designed for distributed memory message passing. It is now clear …

Thread-local storage extension to support thread-based MPI/OpenMP applications

P Carribault, M Pérache, H Jourdren - International Workshop on OpenMP, 2011 - Springer
With the advent of the multicore era, the architecture of supercomputers in HPC
(High-Performance Computing) is evolving to integrate larger computational nodes with an …

Hybrid parallel programming models for AMR neutron Monte-Carlo transport

D Dureau, G Poëtte - … MC 2013-Joint …, 2014 - sna-and-mc-2013-proceedings …
This paper deals with High Performance Computing (HPC) applied to neutron transport
theory on complex geometries, thanks to both an Adaptive Mesh Refinement (AMR) …

A methodology for assessing computation/communication overlap of MPI nonblocking collectives

A Denis, J Jaeger, E Jeannot… - … : Practice and Experience, 2022 - Wiley Online Library
By allowing computation/communication overlap, MPI nonblocking collectives (NBC) are
supposed to improve application scalability and performance. However, it is known that to …