Smartfuse: Reconfigurable smart switches to accelerate fused collectives in HPC applications

P Haghi, C Tan, A Guo, C Wu, D Liu, A Li… - Proceedings of the 38th …, 2024 - dl.acm.org
Communication switches have sometimes been augmented to process collectives, e.g., in the
IBM BlueGene and Mellanox SHArP switches. In this work, we find that there is a great …

ACCL+: an FPGA-Based Collective Engine for Distributed Applications

Z He, D Korolija, Y Zhu, B Ramhorst, T Laan… - … USENIX Symposium on …, 2024 - usenix.org
FPGAs are increasingly prevalent in cloud deployments, serving as Smart-NICs or network-
attached accelerators. To facilitate the development of distributed applications with FPGAs …

A framework for neural network inference on FPGA-centric SmartNICs

A Guo, T Geng, Y Zhang, P Haghi, C Wu… - … Conference on Field …, 2022 - ieeexplore.ieee.org
FPGA-based SmartNICs offer great potential to significantly improve the performance of high-
performance computing and warehouse data processing by tightly coupling support for …

Workload imbalance in HPC applications: Effect on performance of in-network processing

P Haghi, A Guo, T Geng, A Skjellum… - 2021 IEEE High …, 2021 - ieeexplore.ieee.org
As HPC systems advance to exascale, communication networks are becoming ever more
complex including, e.g., support for in-network processing. While critical in facilitating …

FCsN: A FPGA-Centric SmartNIC Framework for Neural Networks

A Guo, T Geng, Y Zhang, P Haghi, C Wu… - 2022 IEEE 30th …, 2022 - ieeexplore.ieee.org
Network communication is increasingly becoming the performance bottleneck for scaled-out
HPC and warehouse applications, as enormous CPU processing is devoted to packet …

Distributed hardware accelerated secure joint computation on the COPA framework

R Patel, P Haghi, S Jain, A Kot… - 2022 IEEE High …, 2022 - ieeexplore.ieee.org
Performance of distributed data center applications can be improved through use of FPGA-
based SmartNICs, which provide additional functionality and enable higher bandwidth …

A Survey of Potential MPI Complex Collectives: Large-Scale Mining and Analysis of HPC Applications

P Haghi, R Marshall, PH Chen, A Skjellum… - arXiv preprint arXiv …, 2023 - arxiv.org
Offload of MPI collectives to network devices, e.g., NICs and switches, is being implemented
as an effective mechanism to improve application performance by reducing inter- and intra …

ACiS: smart switches with application-level acceleration

P Haghi - 2023 - search.proquest.com
Network performance has contributed fundamentally to the growth of supercomputing over
the past decades. In parallel, High Performance Computing (HPC) peak performance has …

ACiS: Complex Processing in the Switch Fabric

P Haghi, A Guo, T Geng, A Skjellum… - arXiv preprint arXiv …, 2025 - arxiv.org
For the last three decades a core use of FPGAs has been for processing communication:
FPGA-based SmartNICs are in widespread use from the datacenter to IoT. Augmenting …

Scalable STAR array testbed

PFW Wolfe, KE Kolodziej - 2022 IEEE International Symposium …, 2022 - ieeexplore.ieee.org
Phased array radar systems provide efficient utilization of their link budgets by focusing
antenna radiation in a particular direction. Same frequency Simultaneous Transmit and …