StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
In the field of HPC, the current hardware trend is to design multiprocessor architectures that
feature heterogeneous technologies such as specialized coprocessors (e.g., Cell/BE SPUs) or …
OmpSs: a proposal for programming heterogeneous multi-core architectures
In this paper, we present OmpSs, a programming model based on OpenMP and StarSs, that
can also incorporate the use of OpenCL or CUDA kernels. We evaluate the proposal on …
A dependency-aware task-based programming environment for multi-core architectures
Parallel programming on SMP and multi-core architectures is hard. In this paper we present
a programming model for those environments based on automatic function level parallelism …
Productive programming of GPU clusters with OmpSs
Clusters of GPUs are emerging as a new computational scenario. Programming them
requires the use of hybrid models that increase the complexity of the applications, reducing …
Criticality-aware dynamic task scheduling for heterogeneous architectures
Current and future parallel programming models need to be portable and efficient when
moving to heterogeneous multi-core systems. OmpSs is a task-based programming model …
Task scheduling techniques for asymmetric multi-core systems
As performance and energy efficiency have become the main challenges for
next-generation high-performance computing, asymmetric multi-core architectures can provide …
Productive cluster programming with OmpSs
Clusters of SMPs are ubiquitous. They have been traditionally programmed by using MPI.
But, the productivity of MPI programmers is low because of the complexity of expressing …
Using a "codelet" program execution model for exascale machines: position paper
S Zuckerman, J Suetterlein, R Knauerhase… - Proceedings of the 1st …, 2011 - dl.acm.org
As computing has moved relentlessly through giga-, tera-, and peta-scale systems,
exa-scale (a million trillion operations/sec.) computing is currently under active research. DARPA …
An algorithm for the optimal control of the driving of trains
R Franke, P Terwiesch, M Meyer - Proceedings of the 39th IEEE …, 2000 - ieeexplore.ieee.org
We discuss an algorithm that optimizes the driving style of a train. The objective is to
minimize the electrical energy used for traction subject to constraints such as the travel time …
Scheduling dense linear algebra operations on multicore processors
State‐of‐the‐art dense linear algebra software, such as the LAPACK and ScaLAPACK
libraries, suffers performance losses on multicore processors due to their inability to fully …