I/O access patterns in HPC applications: A 360-degree survey
The high-performance computing I/O stack has been complex due to multiple software
layers, the inter-dependencies among these layers, and the different performance tuning …
24/7 characterization of petascale I/O workloads
Developing and tuning computational science applications to run on extreme scale systems
are increasingly complicated processes. Challenges such as managing memory access and …
Efficient filtering of XML documents with XPath expressions
The publish/subscribe paradigm is a popular model for allowing publishers (i.e., data
generators) to selectively disseminate data to a large number of widely dispersed …
Slow and steady feature analysis: higher order temporal coherence in video
How can unlabeled video augment visual learning? Existing methods perform "slow" feature
analysis, encouraging temporal coherence, where the image representations of temporally …
Decoupled direct memory access: Isolating CPU and IO traffic by leveraging a dual-data-port DRAM
Memory channel contention is a critical performance bottleneck in modern systems that have
highly parallelized processing units operating on large data sets. The memory channel is …
Optimizing I/O performance of HPC applications with autotuning
Parallel input/output is an essential component of modern high-performance computing
(HPC). Obtaining good I/O performance for a broad range of applications on diverse HPC …
Taxonomy of data prefetching for multicore processors
Data prefetching is an effective data access latency hiding technique to mask the CPU stall
caused by cache misses and to bridge the performance gap between processor and …
Synthesis and simulation of digital systems containing interacting hardware and software components
Synthesis of systems containing application-specific as well as reprogrammable
components, such as off-the-shelf microprocessors, provides a promising approach to …
Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems
Y Liu, R Gunasekaran, X Ma… - SC'16: Proceedings of …, 2016 - ieeexplore.ieee.org
Inter-application I/O contention and performance interference have been recognized as
severe problems. In this work, we demonstrate, through measurement from Titan (world's …
I/O acceleration with pattern detection
The I/O bottleneck in high-performance computing is becoming worse as application data
continues to grow. In this work, we explore how patterns of I/O within these applications can …