I/O access patterns in HPC applications: A 360-degree survey

JL Bez, S Byna, S Ibrahim - ACM Computing Surveys, 2023 - dl.acm.org
The high-performance computing I/O stack has been complex due to multiple software
layers, the inter-dependencies among these layers, and the different performance tuning …

24/7 characterization of petascale I/O workloads

P Carns, R Latham, R Ross, K Iskra… - … on Cluster Computing …, 2009 - ieeexplore.ieee.org
Developing and tuning computational science applications to run on extreme scale systems
are increasingly complicated processes. Challenges such as managing memory access and …

Efficient filtering of XML documents with XPath expressions

CY Chan, P Felber, M Garofalakis, R Rastogi - The VLDB Journal, 2002 - Springer
The publish/subscribe paradigm is a popular model for allowing publishers (i.e., data
generators) to selectively disseminate data to a large number of widely dispersed …

Slow and steady feature analysis: higher order temporal coherence in video

D Jayaraman, K Grauman - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com
How can unlabeled video augment visual learning? Existing methods perform "slow" feature
analysis, encouraging temporal coherence, where the image representations of temporally …

Decoupled direct memory access: Isolating CPU and IO traffic by leveraging a dual-data-port DRAM

D Lee, L Subramanian… - 2015 International …, 2015 - ieeexplore.ieee.org
Memory channel contention is a critical performance bottleneck in modern systems that have
highly parallelized processing units operating on large data sets. The memory channel is …

Optimizing I/O performance of HPC applications with autotuning

B Behzad, S Byna, Prabhat, M Snir - ACM Transactions on Parallel …, 2019 - dl.acm.org
Parallel input/output is an essential component of modern high-performance computing
(HPC). Obtaining good I/O performance for a broad range of applications on diverse HPC …

Taxonomy of data prefetching for multicore processors

S Byna, Y Chen, XH Sun - Journal of Computer Science and Technology, 2009 - Springer
Data prefetching is an effective data access latency hiding technique to mask the CPU stall
caused by cache misses and to bridge the performance gap between processor and …

Synthesis and simulation of digital systems containing interacting hardware and software components

RK Gupta, CN Coelho Jr, G De Micheli - DAC, 1992 - si2.epfl.ch
Synthesis of systems containing application-specific as well as reprogrammable
components, such as off-the-shelf microprocessors, provides a promising approach to …

Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems

Y Liu, R Gunasekaran, X Ma… - SC'16: Proceedings of …, 2016 - ieeexplore.ieee.org
Inter-application I/O contention and performance interference have been recognized as
severe problems. In this work, we demonstrate, through measurement from Titan (world's …

I/O acceleration with pattern detection

J He, J Bent, A Torres, G Grider, G Gibson… - Proceedings of the …, 2013 - dl.acm.org
The I/O bottleneck in high-performance computing is becoming worse as application data
continues to grow. In this work, we explore how patterns of I/O within these applications can …