An introduction to the Compute Express Link (CXL) interconnect

D Das Sharma, R Blankenship, D Berger - ACM Computing Surveys, 2024 - dl.acm.org
The Compute Express Link (CXL) is an open industry-standard interconnect between
processors and devices such as accelerators, memory buffers, smart network interfaces …

Empowering cloud computing with network acceleration: a survey

L Rosa, L Foschini, A Corradi - IEEE Communications Surveys …, 2024 - ieeexplore.ieee.org
Modern interactive and data-intensive applications must operate under demanding time
constraints, prompting a shift toward the adoption of specialized software and hardware …

Clio: A hardware-software co-designed disaggregated memory system

Z Guo, Y Shan, X Luo, Y Huang, Y Zhang - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Memory disaggregation has attracted great attention recently because of its benefits in
efficient memory utilization and ease of management. So far, memory disaggregation …

Electrode: Accelerating Distributed Protocols with eBPF

Y Zhou, Z Wang, S Dharanipragada, M Yu - 20th USENIX Symposium …, 2023 - usenix.org
Implementing distributed protocols under a standard Linux kernel networking stack enjoys
the benefits of load-aware CPU scaling, high compatibility, and robust security and isolation …

High-throughput and flexible host networking for accelerated computing

A Skiadopoulos, Z Xie, M Zhao, Q Cai… - … USENIX Symposium on …, 2024 - usenix.org
Modern network hardware is able to meet the stringent bandwidth demands of applications
like GPU-accelerated AI. However, existing host network stacks offer a hard tradeoff …

Paella: Low-latency model serving with software-defined GPU scheduling

KKW Ng, HM Demoulin, V Liu - Proceedings of the 29th Symposium on …, 2023 - dl.acm.org
Model serving systems play a critical role in multiplexing machine learning inference jobs
across shared GPU infrastructure. These systems have traditionally sat at a high level of …

Cornflakes: Zero-copy serialization for microsecond-scale networking

D Raghavan, S Ravi, G Yuan, P Thaker… - Proceedings of the 29th …, 2023 - dl.acm.org
Data serialization is critical for many datacenter applications, but the memory copies
required to move application data into packets are costly. Recent zero-copy APIs expose …

Making kernel bypass practical for the cloud with Junction

J Fried, GI Chaudhry, E Saurez, E Choukse… - … USENIX Symposium on …, 2024 - usenix.org
Kernel bypass systems have demonstrated order of magnitude improvements in throughput
and tail latency for network-intensive applications relative to traditional operating systems …

Peeling back the carbon curtain: Carbon optimization challenges in cloud computing

J Wang, U Gupta, A Sriraman - Proceedings of the 2nd Workshop on …, 2023 - dl.acm.org
The increasing carbon emissions from cloud computing require new methods to reduce its
environmental impact. We explore extending data center server lifetimes to reduce …

Towards μs tail latency and terabit Ethernet: disaggregating the host network stack

Q Cai, M Vuppalapati, J Hwang, C Kozyrakis… - Proceedings of the …, 2022 - dl.acm.org
Dedicated, tightly integrated, and static packet processing pipelines in today's most widely
deployed network stacks preclude them from fully exploiting capabilities of modern …