Integrating quantum computing resources into scientific HPC ecosystems

T Beck, A Baroni, R Bennink, G Buchs… - Future Generation …, 2024 - Elsevier
Quantum Computing (QC) offers significant potential to enhance scientific discovery in fields
such as quantum chemistry, optimization, and artificial intelligence. Yet QC faces challenges …

Frontier: exploring exascale

S Atchley, C Zimmer, J Lange, D Bernholdt… - Proceedings of the …, 2023 - dl.acm.org
As the US Department of Energy (DOE) computing facilities began deploying petascale
systems in 2008, DOE was already setting its sights on exascale. In that year, DARPA …

Enabling rapid COVID-19 small molecule drug design through scalable deep learning of generative models

SA Jacobs, T Moon, K McLoughlin… - … Journal of High …, 2021 - journals.sagepub.com
We improved the quality and reduced the time to produce machine learned models for use
in small molecule antiviral design. Our globally asynchronous multi-level parallel training …

PolarFly: a cost-effective and flexible low-diameter topology

K Lakhotia, M Besta, L Monroe, K Isham… - … Conference for High …, 2022 - ieeexplore.ieee.org
In this paper we present PolarFly, a diameter-2 network topology based on the Erdős-Rényi
family of polarity graphs from finite geometry. This is the first known diameter-2 topology that …

Numerical algorithms for high-performance computational science

J Dongarra, L Grigori… - … Transactions of the …, 2020 - royalsocietypublishing.org
A number of features of today's high-performance computers make it challenging to exploit
these machines fully for computational science. These include increasing core counts but …

Workload imbalance in HPC applications: Effect on performance of in-network processing

P Haghi, A Guo, T Geng, A Skjellum… - 2021 IEEE High …, 2021 - ieeexplore.ieee.org
As HPC systems advance to exascale, communication networks are becoming ever more
complex including, e.g., support for in-network processing. While critical in facilitating …

Single‐ and multi‐GPU computing on NVIDIA‐ and AMD‐based server platforms for solidification modeling application

K Halbiniak, N Meyer, K Rojek - Concurrency and Computation …, 2024 - Wiley Online Library
This work explores the performance of single‐ and multi‐GPU computing on state‐of‐the‐art
NVIDIA‐ and AMD‐based server‐class hardware using various programming interfaces to …

Characterizing performance of graph neighborhood communication patterns

S Ghosh, NR Tallent… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Distributed-memory graph algorithms are fundamental enablers in scientific computing and
analytics workflows. A majority of graph algorithms rely on the graph neighborhood …

Improving communication by optimizing on-node data movement with data layout

T Zhao, M Hall, H Johansen, S Williams - Proceedings of the 26th ACM …, 2021 - dl.acm.org
We present optimizations to improve communication performance by reducing on-node data
movement for a class of distributed memory applications. The primary concept is to eliminate …

Towards efficient remote OpenMP offloading

W Lu, B Shan, E Raut, J Meng, M Araya-Polo… - … Workshop on OpenMP, 2022 - Springer
On modern heterogeneous HPC systems, the most popular way to realize distributed
computation is the hybrid programming model of MPI+X (X being OpenMP/CUDA/etc.), as it …