Fully integrated FPGA molecular dynamics simulations

C Yang, T Geng, T Wang, R Patel, Q Xiong… - Proceedings of the …, 2019 - dl.acm.org
The implementation of Molecular Dynamics (MD) on FPGAs has received substantial
attention. Previous work, however, has consisted of either proof-of-concept implementations …

Reconfigurable switches for high performance and flexible MPI collectives

P Haghi, A Guo, Q Xiong, C Yang… - Concurrency and …, 2022 - Wiley Online Library
There has been much effort in offloading MPI collective operations into hardware. But while
NIC‐based collective acceleration is well‐studied, offloading their processing into the …

FPGAs in the network and novel communicator support accelerate MPI collectives

P Haghi, A Guo, Q Xiong, R Patel… - 2020 IEEE High …, 2020 - ieeexplore.ieee.org
MPI collective operations can often be performance killers in HPC applications; we seek to
solve this bottleneck by offloading them to reconfigurable hardware within the switch itself …

Secret sharing MPC on FPGAs in the datacenter

PF Wolfe, R Patel, R Munafo, M Varia… - … Conference on Field …, 2020 - ieeexplore.ieee.org
Multi-Party Computation (MPC) is a technique enabling data from several sources to be
used in a secure computation revealing only the result while protecting the original data …

Accelerating MPI collectives with FPGAs in the network and novel communicator support

Q Xiong, C Yang, P Haghi, A Skjellum… - 2020 IEEE 28th …, 2020 - ieeexplore.ieee.org
MPI collective operations can often be performance killers in HPC applications; we seek to
solve this bottleneck by offloading them to reconfigurable hardware within the switch itself …

OPTWEB: a lightweight fully connected inter-FPGA network for efficient collectives

K Mizutani, H Yamaguchi, Y Urino… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Modern FPGA accelerators can be equipped with many high-bandwidth network I/Os, e.g., 64
x 50 Gbps, enabled by onboard optics or co-packaged optics. Some dozens of tightly …

A novel approach to supporting communicators for in-switch processing of MPI collectives

J Stern, Q Xiong, A Skjellum, M Herbordt - Workshop on Exascale MPI, 2018 - bu.edu
MPI collective operations can often be performance killers in HPC applications;
we seek to solve this bottleneck by offloading them to hardware within the switch itself. We …

Accelerating MPI reduce with FPGAs in the network

J Stern, Q Xiong, J Sheng, A Skjellum… - Proc Workshop on …, 2017 - bu.edu
MPI collective operations can often be performance killers in HPC applications,
especially ones that require both heavy communication and computation such as …

Accelerating parallel data processing using optically tightly coupled FPGAs

K Mizutani, H Yamaguchi, Y Urino… - Journal of Optical …, 2022 - ieeexplore.ieee.org
A cutting-edge field programmable gate array (FPGA) card can be equipped with high-bandwidth
inputs and outputs by high-density optical integration, e.g., onboard Si-photonics …

ACiS: smart switches with application-level acceleration

P Haghi - 2023 - search.proquest.com
Network performance has contributed fundamentally to the growth of supercomputing over
the past decades. In parallel, High Performance Computing (HPC) peak performance has …