Scaling the power wall: a path to exascale

O Villa, DR Johnson, M O'Connor… - SC'14: Proceedings …, 2014 - ieeexplore.ieee.org
Modern scientific discovery is driven by an insatiable demand for computing performance.
The HPC community is targeting development of supercomputers able to sustain 1 ExaFlops …

The OPS domain specific abstraction for multi-block structured grid computations

IZ Reguly, GR Mudalige, MB Giles… - … on Domain-Specific …, 2014 - ieeexplore.ieee.org
Code maintainability, performance portability and future proofing are some of the key
challenges in this era of rapid change in High Performance Computing. Domain Specific …

MiniApps derived from production HPC applications using multiple programing models

OEB Messer, E D'Azevedo, J Hill… - … Journal of High …, 2018 - journals.sagepub.com
We have developed a set of reduced, proxy applications (“MiniApps”) based on large-scale
application codes supported at the Oak Ridge Leadership Computing Facility (OLCF). The …

Calculating architectural vulnerability factors for spatial multi-bit transient faults

M Wilkening, V Sridharan, S Li… - 2014 47th Annual …, 2014 - ieeexplore.ieee.org
Reliability is an important design constraint in modern microprocessors, and one of the
fundamental reliability challenges is combating the effects of transient faults. This requires …

CARAT: A case for virtual memory through compiler- and runtime-based address translation

B Suchy, S Campanoni, N Hardavellas… - Proceedings of the 41st …, 2020 - dl.acm.org
Virtual memory is a critical abstraction in modern computer systems. Its common model,
paging, is currently seeing considerable innovation, yet its implementations continue to be …

A CUDA implementation of the High Performance Conjugate Gradient benchmark

E Phillips, M Fatica - … Modeling, Benchmarking and Simulation of High …, 2014 - Springer
The High Performance Conjugate Gradient (HPCG) benchmark has been recently
proposed as a complement to the High Performance Linpack (HPL) benchmark currently …

Micro-applications for communication data access patterns and MPI datatypes

T Schneider, R Gerstenberger, T Hoefler - Recent Advances in the …, 2012 - Springer
Data is often communicated from different locations in application memory and is commonly
serialized (copied) to send buffers or from receive buffers. MPI datatypes are a way to avoid …

Skope: A framework for modeling and exploring workload behavior

J Meng, X Wu, V Morozov, V Vishwanath… - Proceedings of the 11th …, 2014 - dl.acm.org
Understanding workload behavior plays an important role in performance studies. The
growing complexity of applications and architectures has increased the gap among …

Energy evaluation for applications with different thread affinities on the Intel Xeon Phi

G Lawson, M Sosonkina, Y Shen - … International Symposium on …, 2014 - ieeexplore.ieee.org
The Intel Xeon Phi coprocessor offers high parallelism on energy-efficient hardware to
minimize energy consumption while maintaining performance. Dynamic frequency and …

Beyond 16GB: out-of-core stencil computations

IZ Reguly, GR Mudalige, MB Giles - … of the Workshop on Memory Centric …, 2017 - dl.acm.org
Stencil computations are a key class of applications, widely used in the scientific computing
community, and a class that has particularly benefited from performance improvements on …