Data locality in high performance computing, big data, and converged systems: An analysis of the cutting edge and a future system architecture

S Usman, R Mehmood, I Katib, A Albeshri - Electronics, 2022 - mdpi.com
Big data has revolutionized science and technology leading to the transformation of our
societies. High-performance computing (HPC) provides the necessary computational power …

DAG Scheduling in the BSP Model

PA Papp, G Anegg, AN Yzelman - arXiv preprint arXiv:2303.05989, 2023 - arxiv.org
We study the problem of scheduling an arbitrary computational DAG on a fixed number of
processors while minimizing the makespan. While previous works have mostly studied this …

A C++ GraphBLAS: specification, implementation, parallelisation, and evaluation

AN Yzelman, D Di Nardo, JM Nash, WJ Suijlen - Preprint, 2020 - albert-jan.yzelman.net
The GraphBLAS is a programming model that expresses graph algorithms in linear
algebraic terms. It takes an easy-to-use, data-centric view where algebraic operations …

Large bandwidth-efficient FFTs on multicore and multi-socket systems

DT Popovici, TM Low… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Current microprocessor trends show a steady increase in the number of cores and/or
threads present on the same CPU die. While this increase improves performance for …

Efficient Multi-Processor Scheduling in Increasingly Realistic Models

PA Papp, G Anegg, A Karanasiou… - Proceedings of the 36th …, 2024 - dl.acm.org
We study the problem of efficiently scheduling a computational DAG on multiple processors.
The majority of previous works have developed and compared algorithms for this problem in …

Efficient Multi-Processor Scheduling in Increasingly Realistic Models (Brief Summary)

PA Papp, G Anegg, A Karanasiou… - 2024 IEEE …, 2024 - ieeexplore.ieee.org
We study the problem of efficiently scheduling a computational DAG on multiple processors.
While previous works have mostly studied this problem in rather simple models, we instead …

Co-Approximator: Enabling Performance Prediction in Colocated Applications

R Mohammad, S Gopalakrishnan… - ACM Transactions on …, 2024 - dl.acm.org
Today's Internet of Things (IoT) devices can colocate multiple applications on a platform with
hardware resource sharing. Such colocations allow for increasing the throughput of …

Parallel Scientific Computation: A Structured Approach Using BSP

RH Bisseling - 2020 - books.google.com
Building upon the wide-ranging success of the first edition, Parallel Scientific Computation
presents a single unified approach to using a range of parallel computers, from a small …

Performance prediction of parallel computing models to analyze cloud-based big data applications

C Shen, W Tong, KKR Choo, S Kausar - Cluster Computing, 2018 - Springer
Performance evaluation of cloud center is a necessary prerequisite to fulfilling contractual
quality of service, particularly in big data applications. However, effectively evaluating …

A parallel algorithm for GAC filtering of the alldifferent constraint

W Suijlen, F de Framond, A Lallouet… - … Conference on Integration …, 2022 - Springer
In constraint programming the Alldifferent constraint is one of the oldest and most used
global constraints. The algorithm by Régin enforces generalized arc-consistency, which is …