OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs

M Khalilov, M Chrapek, S Shen, A Vezzu… - 2024 USENIX Annual …, 2024 - usenix.org
Multi-tenancy is essential for unleashing SmartNIC's potential in datacenters. Our systematic
analysis in this work shows that existing on-path SmartNICs have resource multiplexing …

Network-Offloaded Bandwidth-Optimal Broadcast and Allgather for Distributed AI

M Khalilov, S Di Girolamo, M Chrapek… - … Conference for High …, 2024 - ieeexplore.ieee.org
In the Fully Sharded Data Parallel (FSDP) training pipeline, collective operations can be
interleaved to maximize the communication/computation overlap. In this scenario …

Accelerating lossy and lossless compression on emerging BlueField DPU architectures

Y Li, A Kashyap, W Chen, Y Guo… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Data compression has become a crucial technique in addressing performance bottlenecks
caused by increasing data volumes in High-Performance Computing (HPC), Big Data, and …

Arcus: SLO Management for Accelerators in the Cloud with Traffic Shaping

J Zhao, R Shu, K Lim, Z Fan, T Anderson… - arXiv preprint arXiv …, 2024 - arxiv.org
Cloud servers use accelerators for common tasks (e.g., encryption, compression, hashing) to
improve CPU/GPU efficiency and overall performance. However, users' Service-level …

Offloaded MPI message matching: an optimistic approach

JS García, S Di Girolamo, S Kosta… - SC24-W: Workshops …, 2024 - ieeexplore.ieee.org
Message matching is a critical process ensuring the correct delivery of messages in
distributed and HPC environments. The advent of SmartNICs presents an opportunity to …