Combining federated learning and edge computing toward ubiquitous intelligence in 6G network: Challenges, recent advances, and future directions

Q Duan, J Huang, S Hu, R Deng… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Full leverage of the huge volume of data generated on a large number of user devices for
providing intelligent services in the 6G network calls for Ubiquitous Intelligence (UI). A key to …

End-edge-cloud collaborative computing for deep learning: A comprehensive survey

Y Wang, C Yang, S Lan, L Zhu… - … Surveys & Tutorials, 2024 - ieeexplore.ieee.org
The booming development of deep learning applications and services heavily relies on
large deep learning models and massive data in the cloud. However, cloud-based deep …

MLaaS in the wild: Workload analysis and scheduling in large-scale heterogeneous GPU clusters

Q Weng, W Xiao, Y Yu, W Wang, C Wang, J He… - … USENIX Symposium on …, 2022 - usenix.org
With the sustained technological advances in machine learning (ML) and the availability of
massive datasets recently, tech companies are deploying large ML-as-a-Service (MLaaS) …

A unified architecture for accelerating distributed DNN training in heterogeneous GPU/CPU clusters

Y Jiang, Y Zhu, C Lan, B Yi, Y Cui, C Guo - 14th USENIX Symposium on …, 2020 - usenix.org
Data center clusters that run DNN training jobs are inherently heterogeneous. They have
GPUs and CPUs for computation and network bandwidth for distributed training. However …

An efficient design of intelligent network data plane

G Zhou, Z Liu, C Fu, Q Li, K Xu - 32nd USENIX Security Symposium …, 2023 - usenix.org
Deploying machine learning models directly on the network data plane enables intelligent
traffic analysis at line-speed using data-driven models rather than predefined protocols …

In-network machine learning using programmable network devices: A survey

C Zheng, X Hong, D Ding, S Vargaftik… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Machine learning is widely used to solve networking challenges, ranging from traffic
classification and anomaly detection to network configuration. However, machine learning …

A survey on data plane programming with P4: Fundamentals, advances, and applied research

F Hauser, M Häberle, D Merling, S Lindner… - Journal of Network and …, 2023 - Elsevier
Programmable data planes allow users to define their own data plane algorithms for network
devices including appropriate data plane application programming interfaces (APIs) which …

ATP: In-network aggregation for multi-tenant learning

CL Lao, Y Le, K Mahajan, Y Chen, W Wu… - … USENIX Symposium on …, 2021 - usenix.org
Distributed deep neural network training (DT) systems are widely deployed in clusters where
the network is shared across multiple tenants, i.e., multiple DT jobs. Each DT job computes …

Software-hardware co-design for fast and scalable training of deep learning recommendation models

D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch… - Proceedings of the 49th …, 2022 - dl.acm.org
Deep learning recommendation models (DLRMs) have been used across many business-
critical services at Meta and are the single largest AI application in terms of infrastructure …

A guide through the zoo of biased SGD

Y Demidovich, G Malinovsky… - Advances in Neural …, 2023 - proceedings.neurips.cc
Stochastic Gradient Descent (SGD) is arguably the most important single algorithm
in modern machine learning. Although SGD with unbiased gradient estimators has been …