Recorder 2.0: Efficient parallel I/O tracing and analysis
Recorder is a multi-level I/O tracing tool that captures HDF5, MPI-I/O, and POSIX I/O calls. In
this paper, we present a new version of Recorder that adds support for most metadata …
An empirical study of I/O separation for burst buffers in HPC systems
To meet the exascale I/O requirements for the High-Performance Computing (HPC), a new
I/O subsystem, Burst Buffer, based on solid state drives (SSD), has been developed …
HadaFS: A File System Bridging the Local and Shared Burst Buffer for Exascale Supercomputers
X He, B Yang, J Gao, W Xiao, Q Chen, S Shi… - … USENIX Conference on …, 2023 - usenix.org
Current supercomputers introduce SSDs to form a Burst Buffer (BB) layer to meet the HPC
application's growing I/O requirements. BBs can be divided into two types by deployment …
Optimizing the SSD burst buffer by traffic detection
Currently, HPC storage systems still use hard disk drive (HDD) as their dominant storage
device. Solid state drive (SSD) is widely deployed as the buffer to HDDs. Burst buffer has …
PeakFS: An Ultra-High Performance Parallel File System via Computing-Network-Storage Co-Optimization for HPC Applications
Y Chen, H Yang, K Lu, W Huang… - … on Parallel and …, 2024 - ieeexplore.ieee.org
Emerging high-performance computing (HPC) applications with diverse workload
characteristics impose greater demands on parallel file systems (PFSs). PFSs also require …
Pinpointing crash-consistency bugs in the HPC I/O stack: a cross-layer approach
We present ParaCrash, a testing framework for studying crash recovery in a typical HPC I/O
stack, and demonstrate its use by identifying 15 new crash-consistency bugs in various …
DeepFetch: A Node-Aware Greedy Fetch System for Distributed Cache of Deep Learning Applications
L Kong, F Mei, C Zhu, W Cheng… - … Architecture and Storage …, 2024 - ieeexplore.ieee.org
Data I/O poses a significant bottleneck for distributed deep learning applications. Utilizing
computing-node attached storage as a cache has become a prevalent solution to this …
Design and Implementation of Burst Buffer Over-Subscription Scheme for HPC Storage Systems
Burst Buffer is widely used in supercomputer centers to bridge the performance gap
between computational power and the high-performance I/O systems. The primary role of …
FINCHFS: Design of Ad-Hoc File System for I/O Heavy HPC Workloads
Although the performance improvements in parallel file systems have been significant, the
rise of data science and deep learning using Python has introduced new I/O requirements …
Understanding Highly Configurable Storage for Diverse Workloads
O Kogiou, H Devarajan, C Wang, W Yu… - … on Cluster Computing …, 2024 - ieeexplore.ieee.org
Highly configurable storage solutions such as VAST DataStore recently have emerged and
are now being deployed in many High Performance Computing facilities. However, these …