Characterizing off-path SmartNIC for accelerating distributed systems

X Wei, R Cheng, Y Yang, R Chen, H Chen - 17th USENIX Symposium …, 2023 - usenix.org
SmartNICs have recently emerged as an appealing device for accelerating distributed
systems. However, there has not been a comprehensive characterization of SmartNICs, and …

Clover: Toward sustainable AI with carbon-aware machine learning inference service

B Li, S Samsi, V Gadepally, D Tiwari - Proceedings of the International …, 2023 - dl.acm.org
This paper presents a solution to the challenge of mitigating carbon emissions from hosting
large-scale machine learning (ML) inference services. ML inference is critical to modern …

Legion: Automatically pushing the envelope of Multi-GPU system for Billion-Scale GNN training

J Sun, L Su, Z Shi, W Shen, Z Wang, L Wang… - 2023 USENIX Annual …, 2023 - usenix.org
Graph neural network (GNN) has been widely applied in real-world applications, such as
product recommendation in e-commerce platforms and risk control in financial management …

Lightning: A reconfigurable photonic-electronic SmartNIC for fast and energy-efficient inference

Z Zhong, M Yang, J Lang, C Williams… - Proceedings of the …, 2023 - dl.acm.org
The massive growth of machine learning-based applications and the end of Moore's law
have created a pressing need to redesign computing platforms. We propose Lightning, the …

CompressGraph: Efficient parallel graph analytics with rule-based compression

Z Chen, F Zhang, JW Guan, J Zhai, X Shen… - Proceedings of the …, 2023 - dl.acm.org
Modern graphs exert colossal time and space pressure on graph analytics applications. In
2022, Facebook social graph reaches 2.91 billion users with trillions of edges. Many …

ACCL+: an FPGA-Based Collective Engine for Distributed Applications

Z He, D Korolija, Y Zhu, B Ramhorst, T Laan… - … USENIX Symposium on …, 2024 - usenix.org
FPGAs are increasingly prevalent in cloud deployments, serving as Smart-NICs or network-
attached accelerators. To facilitate the development of distributed applications with FPGAs …

Towards a fully disaggregated and programmable data center

Y Shan, W Lin, Z Guo, Y Zhang - Proceedings of the 13th ACM SIGOPS …, 2022 - dl.acm.org
Today, we are seeing two trends in the data center. On the one hand, applications are
becoming more fine-grained, driven by the recent trend of serverless computing and …

STYX: Exploiting SmartNIC capability to reduce datacenter memory tax

H Ji, M Mansi, Y Sun, Y Yuan, J Huang… - 2023 USENIX Annual …, 2023 - usenix.org
Memory optimization kernel features, such as memory deduplication, are designed to
improve the overall efficiency of systems like datacenter servers, and they have proven to be …

SmartDS: Middle-tier-centric SmartNIC enabling application-aware message split for disaggregated block storage

J Zhang, H Huang, L Zhu, S Ma, D Rong… - Proceedings of the 50th …, 2023 - dl.acm.org
The widespread deployment of storage disaggregation in the cloud has facilitated flexible
scaling and storage overprovisioning, allowing for high utilization of storage capacity and …

Flagger: Cooperative acceleration for large-scale cross-silo federated learning aggregation

X Pan, Y An, S Liang, B Mao, M Zhang… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Cross-silo federated learning (FL) leverages homomorphic encryption (HE) to obscure the
model updates from the clients. However, HE poses the challenges of complex …