Effective performance portability
Exascale computing brings with it diverse machine architectures and programming
approaches which challenge application developers. Applications need to perform well on a …
Scaling embedded in-situ indexing with DeltaFS
Analysis of large-scale simulation output is a core element of scientific inquiry, but analysis
queries may experience significant I/O overhead when the data is not structured for efficient …
Tuning parallel data compression and I/O for large-scale earthquake simulation
Scientific applications, such as those simulating earthquakes, the origins of the universe, etc.,
often produce massive amounts of data as high-performance computing (HPC) systems are …
TAPIOCA: An I/O library for optimized topology-aware data aggregation on large-scale supercomputers
Reading and writing data efficiently from storage systems is necessary for most scientific
simulations to achieve good performance at scale. Many software solutions have been …
SCTuner: An autotuner addressing dynamic I/O needs on supercomputer I/O subsystems
In high-performance computing (HPC), scientific applications often manage a massive
amount of data using I/O libraries. These libraries provide convenient data model …
Battle of the defaults: Extracting performance characteristics of HDF5 under production load
Popular parallel I/O libraries, such as HDF5, provide tuning parameters to obtain superior
performance. However, the selection of effective parameters on production systems is …
Optimizing GPU-enhanced HPC system and cloud procurements for scientific workloads
Modern GPUs are capable of sustaining floating point operation rates and memory
bandwidths that exceed those of most currently available CPUs, making them attractive …
Software-defined storage for fast trajectory queries using a DeltaFS indexed massive directory
In this paper we introduce the Indexed Massive Directory, a new technique for indexing data
within DeltaFS. With its design as a scalable, server-less file system for HPC platforms …
Unleashing in-network computing on scientific workloads
Many recent efforts have shown that in-network computing can benefit various datacenter
applications. In this paper, we explore a relatively less-explored domain which we argue can …
Analysis of Vector Particle-In-Cell (VPIC) memory usage optimizations on cutting-edge computer architectures
Vector Particle-In-Cell (VPIC) is one of the fastest plasma simulation codes in the
world, with particle numbers ranging from one trillion on the first petascale system …