Flow optimization strategies in data center networks: A survey

Y Liu, T Yu, Q Meng, Q Liu - Journal of Network and Computer Applications, 2024 - Elsevier
In the era of digitization, Data Center Networks (DCN) have emerged as a critical component
supporting infrastructure for cloud computing, big data analytics, online services, and more …

Towards Domain-Specific network transport for distributed DNN training

H Wang, H Tian, J Chen, X Wan, J Xia, G Zeng… - … USENIX Symposium on …, 2024 - usenix.org
The nature of machine learning (ML) applications exposes rich characteristics to underlying
network transport, yet little work has been done so far to systematically exploit these …

ABM: Active buffer management in datacenters

V Addanki, M Apostolaki, M Ghobadi… - Proceedings of the …, 2022 - dl.acm.org
Today's network devices share buffer across queues to avoid drops during transient
congestion and absorb bursts. As the buffer-per-bandwidth-unit in datacenter decreases, the …

Reverie: Low Pass Filter-Based Switch Buffer Sharing for Datacenters with RDMA and TCP Traffic

V Addanki, W Bai, S Schmid, M Apostolaki - 21st USENIX Symposium on …, 2024 - usenix.org
The switch buffers in datacenters today are shared by traffic classes with different loss
tolerance and reaction to congestion signals. In particular, while legacy applications use …

Enhancing load balancing with in-network recirculation to prevent packet reordering in lossless data centers

J Hu, Y He, W Luo, J Huang… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Many existing load balancing mechanisms work effectively in lossy datacenter networks
(DCNs), but they suffer from serious packet reordering in lossless Ethernet DCNs deployed …

A microscopic view of bursts, buffer contention, and loss in data centers

E Ghabashneh, Y Zhao, C Lumezanu… - Proceedings of the …, 2022 - dl.acm.org
Managing data center networks with low loss requires understanding traffic dynamics at
short (millisecond) time-scales, especially the burstiness of traffic, and to what extent bursts …

xNet: Improving expressiveness and granularity for network modeling with graph neural networks

M Wang, L Hui, Y Cui, R Liang… - IEEE INFOCOM 2022 …, 2022 - ieeexplore.ieee.org
Today's network is notorious for its complexity and uncertainty. Network operators often rely
on network models to achieve efficient network planning, operation, and optimization. The …

Credence: Augmenting Datacenter Switch Buffer Sharing with ML Predictions

V Addanki, M Pacut, S Schmid - 21st USENIX Symposium on Networked …, 2024 - usenix.org
Packet buffers in datacenter switches are shared across all the switch ports in order to
improve the overall throughput. The trend of shrinking buffer sizes in datacenter switches …

Flow scheduling with imprecise knowledge

W Li, X He, Y Liu, K Li, K Chen, Z Ge, Z Guan… - … USENIX Symposium on …, 2024 - usenix.org
Most existing data center network (DCN) flow scheduling solutions aim to minimize flow
completion times (FCT). However, these solutions either require precise flow information …

PACC: Proactive and accurate congestion feedback for RDMA congestion control

X Zhong, J Zhang, Y Zhang, Z Guan… - IEEE INFOCOM 2022 …, 2022 - ieeexplore.ieee.org
The rapid upgrade of link speed and the prosperity of new applications in data center
networks (DCNs) lead to a rigorous demand for ultra-low latency and high throughput. To …