Affinity-based thread and data mapping in shared memory systems

M Diener, EHM Cruz, MAZ Alves, POA Navaux… - ACM Computing …, 2016 - dl.acm.org
Shared memory architectures have recently experienced a large increase in thread-level
parallelism, leading to complex memory hierarchies with multiple cache memory levels and …

EagerMap: A task mapping algorithm to improve communication and load balancing in clusters of multicore systems

EHM Cruz, M Diener, LL Pilla… - ACM Transactions on …, 2019 - dl.acm.org
Communication between tasks and load imbalance have been identified as a major
challenge for the performance and energy efficiency of parallel applications. A common way …

Kernel-based thread and data mapping for improved memory affinity

M Diener, EHM Cruz, MAZ Alves… - … on Parallel and …, 2015 - ieeexplore.ieee.org
Reducing the cost of memory accesses, both in terms of performance and energy
consumption, is a major challenge in shared-memory architectures. Modern systems have …

Boosting graph analytics by tuning threads and data affinity on NUMA systems

HMGA Rocha, J Schwarzrock… - 2021 29th Euromicro …, 2021 - ieeexplore.ieee.org
The execution of large real-world graphs, such as web searches and social networks, has
been boosted by modern HPC systems. However, their irregular communication patterns …

Topology-aware job mapping

Y Georgiou, E Jeannot, G Mercier… - … Journal of High …, 2018 - journals.sagepub.com
A Resource and Job Management System (RJMS) is a crucial system software part of the
HPC stack. It is responsible for efficiently delivering computing power to applications in …

Effective exploration of thread throttling and thread/page mapping on NUMA systems

J Schwarzrock, HMGA Rocha… - 2020 IEEE 22nd …, 2020 - ieeexplore.ieee.org
NUMA systems have become commonly used in HPC. However, to fully take advantage of
these systems, the right thread-to-core allocation and page placement are essential. On top …

Hardware-assisted thread and data mapping in hierarchical multicore architectures

EHM Cruz, M Diener, LL Pilla… - ACM Transactions on …, 2016 - dl.acm.org
The performance and energy efficiency of modern architectures depend on memory locality,
which can be improved by thread and data mappings considering the memory access …

Topology-aware resource management for HPC applications

Y Georgiou, E Jeannot, G Mercier… - Proceedings of the 18th …, 2017 - dl.acm.org
The Resource and Job Management System (RJMS) is a crucial system software part of the
HPC stack. It is responsible for efficiently delivering computing power to applications in …

Locality and balance for communication-aware thread mapping in multicore systems

M Diener, EHM Cruz, MAZ Alves, MS Alhakeem… - Euro-Par 2015: Parallel …, 2015 - Springer
In multicore architectures, deciding where to execute the threads of parallel applications is
increasingly a significant challenge. This thread mapping has a large impact on the …

A scalable and adaptable ILP-based approach for task mapping on MPSoC considering load balance and communication optimization

K Huang, X Zhang, D Zheng, M Yu… - … on Computer-Aided …, 2018 - ieeexplore.ieee.org
Task map** has been a hot topic in multiprocessor system-on-chip software design for
decades. During the mapping process, load balance (LB) and communication optimization …