Partitioned global address space languages
The Partitioned Global Address Space (PGAS) model is a parallel programming model that
aims to improve programmer productivity while at the same time aiming for high …
Habanero-Java: the new adventures of old X10
In this paper, we present the Habanero-Java (HJ) language developed at Rice University as
an extension to the original Java-based definition of the X10 language. HJ includes a …
Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
GF Diamos, AR Kerr, S Yalamanchili… - Proceedings of the 19th …, 2010 - dl.acm.org
Ocelot is a dynamic compilation framework designed to map the explicitly data parallel
execution model used by NVIDIA CUDA applications onto diverse multithreaded platforms …
Slaw: a scalable locality-aware adaptive work-stealing scheduler for multi-core systems
This poster introduces SLAW, a Scalable Locality-aware Adaptive Work-stealing scheduler.
The SLAW features an adaptive task scheduling algorithm combined with a locality-aware …
Trends in data locality abstractions for HPC systems
The cost of data movement has always been an important concern in high performance
computing (HPC) systems. It has now become the dominant factor in terms of both energy …
Exascale computing trends: Adjusting to the "new normal" for computer architecture
We now have 20 years of data under our belt about the performance of supercomputers
against at least a single floating-point benchmark from dense linear algebra. Until about …
Extreme heterogeneity 2018 - productive computational science in the era of extreme heterogeneity: Report for DOE ASCR workshop on extreme heterogeneity
JS Vetter, R Brightwell, M Gokhale, P McCormick… - 2018 - osti.gov
The 2018 Basic Research Needs Workshop on Extreme Heterogeneity identified five Priority
Research Directions for realizing the capabilities needed to address the challenges posed …
Hierarchical work stealing on manycore clusters
Partitioned Global Address Space languages like UPC offer a convenient way of
expressing large shared data structures, especially for irregular structures that require …
The locality descriptor: A holistic cross-layer abstraction to express data locality in GPUs
Exploiting data locality in GPUs is critical to making more efficient use of the existing caches
and the NUMA-based memory hierarchy expected in future GPUs. While modern GPU …
Hpvm: Heterogeneous parallel virtual machine
We propose a parallel program representation for heterogeneous systems, designed to
enable performance portability across a wide range of popular parallel hardware, including …