Integrating quantum computing resources into scientific HPC ecosystems
Quantum Computing (QC) offers significant potential to enhance scientific discovery in fields
such as quantum chemistry, optimization, and artificial intelligence. Yet QC faces challenges …
Frontier: exploring exascale
As the US Department of Energy (DOE) computing facilities began deploying petascale
systems in 2008, DOE was already setting its sights on exascale. In that year, DARPA …
Enabling rapid COVID-19 small molecule drug design through scalable deep learning of generative models
SA Jacobs, T Moon, K McLoughlin… - … Journal of High …, 2021 - journals.sagepub.com
We improved the quality and reduced the time to produce machine learned models for use
in small molecule antiviral design. Our globally asynchronous multi-level parallel training …
PolarFly: a cost-effective and flexible low-diameter topology
In this paper we present PolarFly, a diameter-2 network topology based on the Erdős-Rényi
family of polarity graphs from finite geometry. This is the first known diameter-2 topology that …
Numerical algorithms for high-performance computational science
A number of features of today's high-performance computers make it challenging to exploit
these machines fully for computational science. These include increasing core counts but …
Workload imbalance in HPC applications: Effect on performance of in-network processing
As HPC systems advance to exascale, communication networks are becoming ever more
complex, including, e.g., support for in-network processing. While critical in facilitating …
Single- and multi-GPU computing on NVIDIA- and AMD-based server platforms for solidification modeling application
This work explores the performance of single- and multi-GPU computing on state-of-the-art
NVIDIA- and AMD-based server-class hardware using various programming interfaces to …
Characterizing performance of graph neighborhood communication patterns
Distributed-memory graph algorithms are fundamental enablers in scientific computing and
analytics workflows. A majority of graph algorithms rely on the graph neighborhood …
Improving communication by optimizing on-node data movement with data layout
We present optimizations to improve communication performance by reducing on-node data
movement for a class of distributed memory applications. The primary concept is to eliminate …
Towards efficient remote OpenMP offloading
On modern heterogeneous HPC systems, the most popular way to realize distributed
computation is the hybrid programming model of MPI+X (X being OpenMP/CUDA/etc.), as it …