Scaling the power wall: a path to exascale
Modern scientific discovery is driven by an insatiable demand for computing performance.
The HPC community is targeting development of supercomputers able to sustain 1 ExaFlops …
The OPS domain specific abstraction for multi-block structured grid computations
Code maintainability, performance portability and future proofing are some of the key
challenges in this era of rapid change in High Performance Computing. Domain Specific …
MiniApps derived from production HPC applications using multiple programing models
OEB Messer, E D'Azevedo, J Hill… - … Journal of High …, 2018 - journals.sagepub.com
We have developed a set of reduced, proxy applications (“MiniApps”) based on large-scale
application codes supported at the Oak Ridge Leadership Computing Facility (OLCF). The …
Calculating architectural vulnerability factors for spatial multi-bit transient faults
Reliability is an important design constraint in modern microprocessors, and one of the
fundamental reliability challenges is combating the effects of transient faults. This requires …
CARAT: A case for virtual memory through compiler- and runtime-based address translation
Virtual memory is a critical abstraction in modern computer systems. Its common model,
paging, is currently seeing considerable innovation, yet its implementations continue to be …
A CUDA implementation of the High Performance Conjugate Gradient benchmark
E Phillips, M Fatica - … Modeling, Benchmarking and Simulation of High …, 2014 - Springer
The High Performance Conjugate Gradient (HPCG) benchmark has been recently
proposed as a complement to the High Performance Linpack (HPL) benchmark currently …
Micro-applications for communication data access patterns and MPI datatypes
Data is often communicated from different locations in application memory and is commonly
serialized (copied) to send buffers or from receive buffers. MPI datatypes are a way to avoid …
Skope: A framework for modeling and exploring workload behavior
Understanding workload behavior plays an important role in performance studies. The
growing complexity of applications and architectures has increased the gap among …
Energy evaluation for applications with different thread affinities on the Intel Xeon Phi
G Lawson, M Sosonkina, Y Shen - … International Symposium on …, 2014 - ieeexplore.ieee.org
The Intel Xeon Phi coprocessor offers high parallelism on energy-efficient hardware to
minimize energy consumption while maintaining performance. Dynamic frequency and …
Beyond 16GB: out-of-core stencil computations
Stencil computations are a key class of applications, widely used in the scientific computing
community, and a class that has particularly benefited from performance improvements on …