I/O access patterns in HPC applications: A 360-degree survey

JL Bez, S Byna, S Ibrahim - ACM Computing Surveys, 2023 - dl.acm.org
The high-performance computing I/O stack has been complex due to multiple software
layers, the inter-dependencies among these layers, and the different performance tuning …

Parallel I/O evaluation techniques and emerging HPC workloads: A perspective

S Neuwirth, AK Paul - 2021 IEEE International Conference on …, 2021 - ieeexplore.ieee.org
Emerging workloads such as artificial intelligence, big data analytics and complex multi-step
workflows alongside future exascale applications are anticipated future HPC workloads …

DFTracer: An Analysis-Friendly Data Flow Tracer for AI-Driven Workflows

H Devarajan, L Pottier, K Velusamy… - … Conference for High …, 2024 - ieeexplore.ieee.org
Modern HPC workflows involve intricate coupling of simulation, data analytics, and artificial
intelligence (AI) applications to improve time to scientific insight. These workflows require a …

IOMiner: Large-scale analytics framework for gaining knowledge from I/O logs

T Wang, S Snyder, G Lockwood, P Carns… - 2018 IEEE International …, 2018 - sdm.lbl.gov
Modern HPC systems are collecting large amounts of I/O performance data. The massive
volume and heterogeneity of this data, however, have made timely performance of in-depth …

tf-Darshan: Understanding fine-grained I/O performance in machine learning workloads

SWD Chien, A Podobas, IB Peng… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
Machine Learning applications on HPC systems have been gaining popularity in recent
years. The upcoming large scale systems will offer tremendous parallelism for training …

A comprehensive I/O knowledge cycle for modular and automated HPC workload analysis

Z Zhu, S Neuwirth, T Lippert - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
On the way to the exascale era, millions of parallel processing elements are required.
Accordingly, one major challenge is the ever-widening gap between computational power …

Toward understanding I/O behavior in HPC workflows

J Lüttgau, S Snyder, P Carns… - 2018 IEEE/ACM 3rd …, 2018 - ieeexplore.ieee.org
Scientific discovery increasingly depends on complex workflows consisting of multiple
phases and sometimes millions of parallelizable tasks or pipelines. These workflows access …

GIFT: A coupon based Throttle-and-Reward mechanism for fair and efficient I/O bandwidth management on parallel storage systems

T Patel, R Garg, D Tiwari - 18th USENIX Conference on File and Storage …, 2020 - usenix.org
Large-scale parallel applications are highly data-intensive and perform terabytes of I/O
routinely. Unfortunately, on a large-scale system where multiple applications run …

I/O bottleneck detection and tuning: Connecting the dots using interactive log analysis

JL Bez, H Tang, B Xie… - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org
Using parallel file systems efficiently is a tricky problem due to inter-dependencies among
multiple layers of I/O software, including high-level I/O libraries (HDF5, netCDF, etc.), MPI-IO …

A benchmark suite and performance analysis of user-space provenance collectors

S Grayson, F Aguilar, R Milewicz, DS Katz… - Proceedings of the 2nd …, 2024 - dl.acm.org
Computational provenance has many important applications, especially to reproducibility.
System-level provenance collectors can track provenance data without requiring the user to …