Transitioning from file-based HPC workflows to streaming data pipelines with openPMD and ADIOS2

F Poeschel, JE, WF Godoy, N Podhorszki… - Smoky Mountains …, 2021 - Springer
This paper aims to create a transition path from file-based IO to streaming-based workflows
for scientific applications in an HPC environment. By using the openPMD-api, traditional …

Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules

JY Choi, P Zhang, K Mehta, A Blanchard… - Journal of …, 2022 - Springer
Graph Convolutional Neural Network (GCNN) is a popular class of deep learning
(DL) models in material science to predict material properties from the graph representation …

Modeling of advanced accelerator concepts

JL Vay, A Huebl, R Lehe, NM Cook… - Journal of …, 2021 - iopscience.iop.org
Computer modeling is essential to research on Advanced Accelerator Concepts (AAC), as
well as to their design and operation. This paper summarizes the current status and future …

An algorithmic and software pipeline for very large scale scientific data compression with error guarantees

T Banerjee, J Choi, J Lee, Q Gong… - 2022 IEEE 29th …, 2022 - ieeexplore.ieee.org
Efficient data compression is becoming increasingly critical for storing scientific data
because many scientific applications produce vast amounts of data. This paper presents an …

Snowmass21 accelerator modeling community white paper

S Biedron, L Brouwer, DL Bruhwiler, NM Cook… - arXiv preprint arXiv …, 2022 - arxiv.org
After a summary of relevant comments and recommendations from various reports over the
last ten years, this paper examines the modeling needs in accelerator physics, from the …

Olsync: Object-level tiering and coordination in tiered storage systems based on software-defined network

Z Li, Y Wang, S Nie, J Wang, C Zhang, F Yu… - Future Generation …, 2025 - Elsevier
With the adoption of new storage technologies like NVMs, tiered storage has gained
popularity in large-scale, hyper-converged clusters. The storage back-end of hyper …

A Data Optimizer for Region-Aware Self-describing Files in Scientific Computing

Y Song, T Wu, Y Li, G Li, Y Liu, S Yin, W Xue… - Proceedings of the 2024 …, 2024 - dl.acm.org
Acquiring data from scientific simulations for analytical purposes is inherently challenging
due to the complex and irregularly shaped regions within which the data resides, particularly …

HDF5 in the exascale era: Delivering efficient and scalable parallel I/O for exascale applications

M Scot Breitenfeld, H Tang, H Zheng… - … Journal of High …, 2025 - journals.sagepub.com
Accurately modeling real-world systems requires scientific applications at exascale to
generate massive amounts of data and manage data storage efficiently. However, parallel …

The Artificial Scientist--in-transit Machine Learning of Plasma Simulations

J Kelling, V Bolea, M Bussmann… - arXiv preprint arXiv …, 2025 - arxiv.org
Increasing HPC cluster sizes and large-scale simulations that produce petabytes of data per
run, create massive IO and storage challenges for analysis. Deep learning-based …

I/O Behind the Scenes: Bandwidth Requirements of HPC Applications with Asynchronous I/O

A Tarraf, JF Muñoz, DE Singh, T Ozden… - 2024 IEEE …, 2024 - ieeexplore.ieee.org
I/O bandwidth is a critical resource in an HPC cluster. As with all shared resources, its
availability is impacted significantly by the users and the applications they execute. Without …