AuTO: Scaling deep reinforcement learning for datacenter-scale automatic traffic optimization

L Chen, J Lingys, K Chen, F Liu - Proceedings of the 2018 conference of …, 2018 - dl.acm.org
Traffic optimizations (TO, e.g., flow scheduling, load balancing) in datacenters are difficult
online decision-making problems. Previously, they were done with heuristics relying on …

Location-aware and budget-constrained service deployment for composite applications in multi-cloud environment

T Shi, H Ma, G Chen, S Hartmann - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Enterprise application providers are increasingly moving their workloads to the cloud for
technical and economic benefits. Multi-cloud environment makes it possible to orchestrate …

Cost-effective web application replication and deployment in multi-cloud environment

T Shi, H Ma, G Chen, S Hartmann - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Multi-cloud is becoming a popular cloud ecosystem because it allows enterprise users to
share the workload across multiple cloud service providers to achieve high-quality services …

Flow scheduling with imprecise knowledge

W Li, X He, Y Liu, K Li, K Chen, Z Ge, Z Guan… - … USENIX Symposium on …, 2024 - usenix.org
Most existing data center network (DCN) flow scheduling solutions aim to minimize flow
completion times (FCT). However, these solutions either require precise flow information …

Uniform-Cost Multi-Path Routing for Reconfigurable Data Center Networks

J Li, H Gong, F De Marchi, A Gong, Y Lei… - Proceedings of the …, 2024 - dl.acm.org
Reconfigurable data center networks (RDCNs) are arising as a promising data center
network (DCN) design in the post-Moore's law era. However, the constantly reconfigured …

A receiver-driven transport protocol with high link utilization using anti-ECN marking in data center networks

J Hu, J Huang, Z Li, J Wang, T He - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Existing reactive or proactive congestion control protocols are hard to simultaneously
achieve ultra-low latency and high link utilization across all workloads ranging from delay …

MLTCP: A Distributed Technique to Approximate Centralized Flow Scheduling For Machine Learning

S Rajasekaran, S Narang, AA Zabreyko… - Proceedings of the 23rd …, 2024 - dl.acm.org
This paper argues that congestion control protocols in machine learning datacenters sit at a
sweet spot between centralized and distributed flow scheduling solutions. We present …

Network monitoring on multi-pipe switches

M Chiesa, FL Verdi - Proceedings of the ACM on Measurement and …, 2023 - dl.acm.org
Programmable switches have been widely used to design network monitoring solutions that
operate in the fast data-plane level, e.g., detecting heavy hitters, super-spreaders, computing …

Traffic modeling and optimization in datacenters with graph neural network

J Li, P Sun, Y Hu - Computer Networks, 2020 - Elsevier
Traffic Optimization (TO) is a well-known and established topic in datacenters with the
fundamental goal of operating networks efficiently. Traditional TO heuristics may suffer from …

Load balancing with traffic isolation in data center networks

T Zhang, Q Zhang, Y Lei, S Zou, J Huang… - Future Generation …, 2022 - Elsevier
The topologies of current data center networks are typically multi-rooted trees (e.g., leaf–spine)
with rich parallel paths between any pair of hosts. Recent progress has demonstrated …