5G support for industrial IoT applications—challenges, solutions, and research gaps
Industrial IoT has special communication requirements, including high reliability, low
latency, flexibility, and security. These are instinctively provided by the 5G mobile …
Empowering Azure storage with RDMA
W Bai, SS Abdeen, A Agrawal, KK Attre, P Bahl… - … USENIX Symposium on …, 2023 - usenix.org
Given the wide adoption of disaggregated storage in public clouds, networking is the key to
enabling high performance and high reliability in a cloud storage service. In Azure, we …
A taxonomy of live migration management in cloud computing
Cloud Data Centers have become the key infrastructure for providing services. Instance
migration across different computing nodes in edge and cloud computing is essential to …
Disaggregating persistent memory and controlling them remotely: An exploration of passive disaggregated Key-Value stores
Many datacenters and clouds manage storage systems separately from computing services
for better manageability and resource utilization. These existing disaggregated storage …
A Survey of Storage Systems in the RDMA era
Remote Direct Memory Access (RDMA) based network devices are increasingly being
deployed in modern data centers. RDMA brings significant performance improvements over …
Transparent GPU sharing in container clouds for deep learning workloads
Containers are widely used for resource management in datacenters. A common practice to
support deep learning (DL) training in container clouds is to statically bind GPUs to …
High-throughput and flexible host networking for accelerated computing
Modern network hardware is able to meet the stringent bandwidth demands of applications
like GPU-accelerated AI. However, existing host network stacks offer a hard tradeoff …
Collie: Finding Performance Anomalies in RDMA Subsystems
High-speed RDMA networks are getting rapidly adopted in the industry for their low latency
and reduced CPU overheads. To verify that RDMA can be used in production, system …
Slim: OS Kernel Support for a Low-Overhead Container Overlay Network
Containers have become the de facto method for hosting large-scale distributed
applications. Container overlay networks are essential to providing portability for containers …
Understanding RDMA microarchitecture resources for performance isolation
Recent years have witnessed the wide adoption of RDMA in the cloud to accelerate
first-party workloads and achieve cost savings by freeing up CPU cycles. Now cloud providers …