I/O access patterns in HPC applications: A 360-degree survey

JL Bez, S Byna, S Ibrahim - ACM Computing Surveys, 2023 - dl.acm.org
The high-performance computing I/O stack has been complex due to multiple software
layers, the inter-dependencies among these layers, and the different performance tuning …

Parallel I/O evaluation techniques and emerging HPC workloads: A perspective

S Neuwirth, AK Paul - 2021 IEEE International Conference on …, 2021 - ieeexplore.ieee.org
Emerging workloads such as artificial intelligence, big data analytics and complex multi-step
workflows alongside future exascale applications are anticipated future HPC workloads …

Revisiting I/O behavior in large-scale storage systems: The expected and the unexpected

T Patel, S Byna, GK Lockwood, D Tiwari - Proceedings of the …, 2019 - dl.acm.org
Large-scale applications typically spend a large fraction of their execution time performing
I/O to a parallel storage system. However, with rapid progress in compute and storage …

End-to-end I/O monitoring on leading supercomputers

B Yang, W Xue, T Zhang, S Liu, X Ma, X Wang… - ACM Transactions on …, 2023 - dl.acm.org
This paper offers a solution to overcome the complexities of production system I/O
performance monitoring. We present Beacon, an end-to-end I/O resource monitoring and …

Understanding HPC application I/O behavior using system level statistics

AK Paul, O Faaland, A Moody… - 2020 IEEE 27th …, 2020 - ieeexplore.ieee.org
The processor performance of high performance computing (HPC) systems is increasing at
a much higher rate than storage performance. This imbalance leads to I/O performance …

IOMiner: Large-scale analytics framework for gaining knowledge from I/O logs

T Wang, S Snyder, G Lockwood, P Carns… - 2018 IEEE International …, 2018 - sdm.lbl.gov
Modern HPC systems are collecting large amounts of I/O performance data. The massive
volume and heterogeneity of this data, however, have made timely performance of in-depth …

An integrated indexing and search service for distributed file systems

H Sim, A Khan, SS Vazhkudai, SH Lim… - … on Parallel and …, 2020 - ieeexplore.ieee.org
Data services such as search, discovery, and management in scalable distributed
environments have traditionally been decoupled from the underlying file systems, and are …

TagIt: An integrated indexing and search service for file systems

H Sim, Y Kim, SS Vazhkudai, GR Vallée… - Proceedings of the …, 2017 - dl.acm.org
Data services such as search, discovery, and management in scalable distributed
environments have traditionally been decoupled from the underlying file systems, and are …

Full lifecycle data analysis on a large-scale and leadership supercomputer: what can we learn from it?

B Yang, H Wei, W Zhu, Y Zhang, W Liu… - 2024 USENIX Annual …, 2024 - usenix.org
The system architecture of contemporary supercomputers is growing increasingly intricate
with the ongoing evolution of system-wide network and storage technologies, making it …

Comprehensive measurement and analysis of the user-perceived I/O performance in a production leadership-class storage system

L Wan, M Wolf, F Wang, JY Choi… - 2017 IEEE 37th …, 2017 - ieeexplore.ieee.org
With the increase of the scale and intensity of the parallel I/O workloads generated by those
scientific applications running on high performance computing facilities, understanding the …