OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs
Multi-tenancy is essential for unleashing SmartNIC's potential in datacenters. Our systematic
analysis in this work shows that existing on-path SmartNICs have resource multiplexing …
Network-Offloaded Bandwidth-Optimal Broadcast and Allgather for Distributed AI
In the Fully Sharded Data Parallel (FSDP) training pipeline, collective operations can be
interleaved to maximize the communication/computation overlap. In this scenario …
Accelerating lossy and lossless compression on emerging BlueField DPU architectures
Data compression has become a crucial technique in addressing performance bottlenecks
caused by increasing data volumes in High-Performance Computing (HPC), Big Data, and …
Arcus: SLO Management for Accelerators in the Cloud with Traffic Shaping
Cloud servers use accelerators for common tasks (e.g., encryption, compression, hashing) to
improve CPU/GPU efficiency and overall performance. However, users' Service-level …
Offloaded MPI message matching: an optimistic approach
Message matching is a critical process ensuring the correct delivery of messages in
distributed and HPC environments. The advent of SmartNICs presents an opportunity to …