Optimization of a multilevel checkpoint model with uncertain execution scales

S Di, L Bautista-Gome… - SC'14: Proceedings of the …, 2014 - ieeexplore.ieee.org
Future extreme-scale systems are expected to experience different types of failures affecting
applications with different failure scales, from transient uncorrectable memory errors in …

[PDF][PDF] Topology-aware job scheduling strategies for torus networks

J Enos, G Bauer, R Brunner, S Islam… - Proceedings of the …, 2014 - pdfs.semanticscholar.org
● #PBS -l nodeset=ONEOF:FEATURE:s1_24x6x24,s2_24x6x24,… ● Job will run in first
available requested feature ● Good run time consistency for PSDNS in 24x8x24 nodeset ● …

Early experiences scaling VMD molecular visualization and analysis jobs on Blue Waters

JE Stone, B Isralewitz, K Schulten - 2013 Extreme Scaling …, 2013 - ieeexplore.ieee.org
Petascale molecular dynamics simulations provide a powerful tool for probing the dynamics
of cellular processes at atomic and nanosecond resolution not achievable by experimental …

Breaking and fixing the self encryption scheme for data security in mobile devices

P Gasti, Y Chen - 2010 18th Euromicro Conference on Parallel …, 2010 - ieeexplore.ieee.org
Data security is one of the major challenges that prevents the wider acceptance of mobile
devices, especially within business and government environments. It is non-trivial to protect …

[BOOK][B] Failure avoidance techniques for HPC systems based on failure prediction

A Gainaru - 2015 - search.proquest.com
An increasingly larger percentage of computing capacity in today's large high-performance
computing systems is wasted due to failures and recoveries. Moreover, it is expected that …

[PDF][PDF] Expanding Blue Waters with improved acceleration capability

CL Mendes, GH Bauer, WT Kramer… - Proceedings of the …, 2014 - researchgate.net
Blue Waters, the first open-science supercomputer to achieve a sustained rate of one
petaflop/s on a broad mix of scientific applications, is the largest system ever built by Cray. It …

[PDF][PDF] A classification of parallel I/O toward demystifying HPC I/O best practices

R Sisneros - Proceedings of Cray User Group Meeting (CUG-2016), 2016 - cug.org
The process of optimizing parallel I/O can quite easily become daunting. By the nature of its
implementation there are many highly sensitive, tunable parameters and a subtle change to …

Resiliency of high-performance computing systems: A fault-injection-based characterization of the high-speed network in the Blue Waters testbed

SS Tang - 2018 - ideals.illinois.edu
Supercomputers have played an essential role in the progress of science and engineering
research. As the high-performance computing (HPC) community moves towards the next …

[PDF][PDF] How Deep is Your I/O? Toward Practical Large-Scale I/O Optimization via Machine Learning Methods

R Sisneros, JJ Kim, M Raji, K Chadalavada - cug.org
Performance-related diagnostic data routinely collected by administrators of HPC machines
is an excellent target for the application of machine learning approaches. There is a clear …