Parallel I/O evaluation techniques and emerging HPC workloads: A perspective
Emerging workloads such as artificial intelligence, big data analytics and complex multi-step
workflows alongside future exascale applications are anticipated future HPC workloads …
Revisiting I/O behavior in large-scale storage systems: The expected and the unexpected
Large-scale applications typically spend a large fraction of their execution time performing
I/O to a parallel storage system. However, with rapid progress in compute and storage …
Uncovering access, reuse, and sharing characteristics of I/O-Intensive files on Large-Scale production HPC systems
Large-scale high-performance computing (HPC) applications running on supercomputers
produce large amounts of data routinely and store it in files on multi-PB shared parallel …
Parallel I/O Characterization and Optimization on Large-Scale HPC Systems: A 360-Degree Survey
H Ather, JL Bez, C Wang, H Childs, AD Malony… - ar** approach
G **an, W Yang, Y Tan, J Feng, Y Li, J Zhang, J Yu - Parallel Computing, 2024 - Elsevier
Users' limited understanding of the storage system architecture prevents them from fully
utilizing the parallel I/O capability of the storage system, leading to a negative impact on the …
Footprinting parallel I/O–machine learning to classify application's I/O behavior
E Betke, J Kunkel - High Performance Computing: ISC High Performance …, 2019 - Springer
It is not uncommon to run tens of thousands of parallel jobs on large HPC systems. The
amount of data collected by monitoring systems on such systems is immense. Checking …
LASSi: metric based I/O analytics for HPC
LASSi is a tool aimed at analyzing application usage and contention caused by use of
shared resources (filesystem or network) in a HPC system. LASSi was initially developed to …
Auto-tuning for HPC storage stack: an optimization perspective
Storage stack layers in high-performance computing (HPC) systems offer many tunable
parameters controlling I/O behaviors and underlying file system settings. The setting of these …
The importance of temporal behavior when classifying job IO patterns using machine learning techniques
E Betke, J Kunkel - High Performance Computing: ISC High Performance …, 2020 - Springer
Every day, supercomputers execute 1000s of jobs with different characteristics. Data centers
monitor the behavior of jobs to support the users and improve the infrastructure, for instance …