Revisiting I/O behavior in large-scale storage systems: The expected and the unexpected
Large-scale applications typically spend a large fraction of their execution time performing
I/O to a parallel storage system. However, with rapid progress in compute and storage …
A year in the life of a parallel file system
I/O performance is a critical aspect of data-intensive scientific computing. We seek to
advance the state of the practice in understanding and diagnosing I/O performance issues …
Systematically inferring I/O performance variability by examining repetitive job behavior
Monitoring and analyzing I/O behaviors is critical to the efficient utilization of parallel storage
systems. Unfortunately, with increasing I/O requirements and resource contention, I/O …
Uncovering access, reuse, and sharing characteristics of I/O-Intensive files on Large-Scale production HPC systems
Large-scale high-performance computing (HPC) applications running on supercomputers
produce large amounts of data routinely and store it in files on multi-PB shared parallel …
Real-time I/O-monitoring of HPC applications with SIOX, Elasticsearch, Grafana and FUSE
E Betke, J Kunkel - High Performance Computing: ISC High Performance …, 2017 - Springer
The starting point for our work was a demand for an overview of application's I/O behavior,
that provides information about the usage of our HPC “Mistral”. We suspect that some …
UMAMI: a recipe for generating meaningful metrics through holistic I/O performance analysis
I/O efficiency is essential to productivity in scientific computing, especially as many scientific
domains become more data-intensive. Many characterization tools have been used to …
A comprehensive I/O knowledge cycle for modular and automated HPC workload analysis
On the way to the exascale era, millions of parallel processing elements are required.
Accordingly, one major challenge is the ever-widening gap between computational power …
AI-coupled HPC workflows
Increasingly, scientific discovery requires sophisticated and scalable workflows. Workflows
have become the "new applications," wherein multi-scale computing campaigns comprise …
Improving collective I/O performance with machine learning supported auto-tuning
A Bağbaba - 2020 IEEE International Parallel and Distributed …, 2020 - ieeexplore.ieee.org
Collective Input and output (I/O) is an essential approach in high performance computing
(HPC) applications. The achievement of effective collective I/O is a nontrivial job due to the …
Tools for analyzing parallel I/O
Parallel application I/O performance often does not meet user expectations. Additionally,
slight access pattern modifications may lead to significant changes in performance due to …