Bandwidth-optimal all-to-all exchanges in fat tree networks

B Prisacari, G Rodriguez, C Minkenberg… - Proceedings of the 27th …, 2013 - dl.acm.org
The personalized all-to-all collective exchange is one of the most challenging
communication patterns in HPC applications in terms of performance and scalability. In the …

Composable, non-blocking collective operations on Power7 IH

GI Tanase, G Almasi, H Xue, C Archer - Proceedings of the 26th ACM …, 2012 - dl.acm.org
The Power7 IH (P7IH) is one of IBM's latest generation of supercomputers. Like most
modern parallel machines, it has a hierarchical organization consisting of simultaneous …

Monitoramento de Desempenho usando Dados de Proveniência e de Domínio durante a Execução de Aplicações Científicas

R Souza, V Silva, L Neves, D de Oliveira… - … em Desempenho de …, 2015 - sol.sbc.org.br
Computational simulations are generally composed of chained scientific
applications and executed in high-performance computing environments. Such …

[PDF][PDF] L'UNIVERSITÉ BORDEAUX

C DE PERRE - 2009 - academia.edu
A very big thank you also to my thesis supervisors for encouraging me to undertake this
thesis and for supervising me so that I reached the end of these 3 years with results to …

[PDF][PDF] Hierarchical additions to the SPMD programming model

AA Kamil, KA Yelick - 2012 - researchgate.net
Large-scale parallel machines are programmed mainly with the single program, multiple
data (SPMD) model of parallelism. This model has advantages of scalability and simplicity …

Static Analysis and Dynamic Adaptation of Parallelism.

P Huchant - 2019 - inria.hal.science
Scientific applications have an increasing need of resources and many grand scientific
challenges require exascale compute capabilities to be addressed. One major concern to …

[BOOK][B] Single program, multiple data programming for hierarchical computations

AA Kamil - 2012 - search.proquest.com
As performance gains in sequential programming have stagnated due to power constraints,
parallel computing has become the primary tool for increasing performance. Parallel …

Improving the Hybrid model MPI+ Threads through Applications, Runtimes and Performance tools

A Maheo - 2015 - theses.hal.science
To provide increasing computational power for numerical simulations, supercomputers
evolved and are now more and more complex to program. Indeed, after the appearance of …

OpenSHMEM sets and groups: An approach to worksharing and memory management

F Aderholdt, S Pophale, M Gorentla Venkata… - … . OpenSHMEM in the …, 2019 - Springer
Collective operations in the OpenSHMEM programming model are defined over an active
set, which is a grouping of (PEs) based on a triple of information including the starting PE, a …

[PDF][PDF] Understanding the formation of wait states in one-sided communication

MA Hermanns - 2018 - publications.rwth-aachen.de
Due to the available concurrency in modern-day supercomputers, the complexity of
developing efficient parallel applications for these platforms has grown rapidly in the last …