Communication-efficient distributed learning: An overview

X Cao, T Başar, S Diggavi, YC Eldar… - IEEE journal on …, 2023 - ieeexplore.ieee.org
Distributed learning is envisioned as the bedrock of next-generation intelligent networks,
where intelligent agents, such as mobile devices, robots, and sensors, exchange information …

Communication-efficient distributed deep learning: A comprehensive survey

Z Tang, S Shi, W Wang, B Li, X Chu - arXiv preprint arXiv:2003.06307, 2020 - arxiv.org
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …

EF21: A new, simpler, theoretically better, and practically faster error feedback

P Richtárik, I Sokolov… - Advances in Neural …, 2021 - proceedings.neurips.cc
Error feedback (EF), also known as error compensation, is an immensely popular
convergence stabilization mechanism in the context of distributed training of supervised …
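
The EF21 update is compact enough to sketch: each worker i keeps a running gradient estimate g_i and communicates only a compressed difference between its fresh gradient and that estimate. Below is a minimal NumPy sketch assuming a top-k compressor and exact local gradients; grad_fns, the step size, and the uncompressed initialization round are illustrative assumptions, not the authors' reference implementation.

    import numpy as np

    def top_k(v, k):
        """Biased, contractive compressor: keep the k largest-magnitude entries of v."""
        out = np.zeros_like(v)
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
        return out

    def ef21(grad_fns, x0, lr=0.1, k=1, steps=100):
        """EF21 sketch: worker i stores g[i] and sends only c_i = top_k(grad_i(x) - g[i])."""
        n = len(grad_fns)
        x = x0.copy()
        g = [grad_fns[i](x) for i in range(n)]     # assumed uncompressed initialization round
        g_avg = sum(g) / n
        for _ in range(steps):
            x = x - lr * g_avg                     # server step with the aggregated estimate
            for i in range(n):
                c = top_k(grad_fns[i](x) - g[i], k)   # the only message sent to the server
                g[i] = g[i] + c
                g_avg = g_avg + c / n
        return x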

Optimal client sampling for federated learning

W Chen, S Horváth, P Richtárik - arXiv preprint arXiv:2010.13723, 2020 - arxiv.org
It is well understood that client-master communication can be a primary bottleneck in
Federated Learning. In this work, we address this issue with a novel client subsampling …
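
The subsampling idea can be illustrated with generic norm-proportional importance sampling: include client i with probability roughly proportional to the norm of its update and rescale surviving updates by 1/p_i so the aggregate stays unbiased. The sketch below is a simplified stand-in; the paper derives its optimal inclusion probabilities (and how they are capped at 1) more carefully, and sample_clients is a hypothetical helper, not an API from the paper.

    import numpy as np

    def sample_clients(updates, m, rng=np.random.default_rng(0)):
        """Keep roughly m of the n client updates, with inclusion probabilities
        proportional to the update norms, then rescale survivors by 1/p_i so the
        average over all n clients is estimated without bias."""
        n = len(updates)
        norms = np.array([np.linalg.norm(u) for u in updates])
        p = np.minimum(1.0, m * norms / norms.sum())   # simplified probabilities, capped at 1
        keep = rng.random(n) < p
        agg = sum(u / pi for u, pi, k in zip(updates, p, keep) if k)
        return agg / n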

Local SGD: Unified theory and new efficient methods

E Gorbunov, F Hanzely… - … Conference on Artificial …, 2021 - proceedings.mlr.press
We present a unified framework for analyzing local SGD methods in the convex and strongly
convex regimes for distributed/federated training of supervised machine learning models …
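
Local SGD itself is simple to state: each worker runs several SGD steps on its own objective between communication rounds, and the server then averages the resulting models. A minimal sketch, assuming grad_fns[i] returns a (stochastic) gradient of worker i's loss; the step size and step counts are illustrative.

    def local_sgd(grad_fns, x0, lr=0.05, local_steps=8, rounds=50):
        """Each worker performs `local_steps` gradient steps without communicating,
        then the iterates are averaged (one communication round per outer loop)."""
        n = len(grad_fns)
        x = x0.copy()
        for _ in range(rounds):
            local_models = []
            for i in range(n):
                y = x.copy()
                for _ in range(local_steps):
                    y = y - lr * grad_fns[i](y)   # local step, no communication
                local_models.append(y)
            x = sum(local_models) / n             # averaging = the communication round
        return x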

Stochastic distributed learning with gradient quantization and double-variance reduction

S Horváth, D Kovalev, K Mishchenko… - Optimization Methods …, 2023 - Taylor & Francis
We consider distributed optimization over several devices, each sending incremental model
updates to a central server. This setting is considered, for instance, in federated learning …
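
The incremental updates mentioned here are typically compressed with a DIANA-style shift: each worker quantizes the difference between its current gradient and a locally stored reference point h_i, so the quantized messages shrink over time and the compression variance is progressively removed. A hedged sketch, assuming an unbiased random-sparsification quantizer; the quantizer, step sizes, and variable names are illustrative choices, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def rand_sparsify(v, p=0.25):
        """Unbiased quantizer: keep each coordinate with probability p, rescale by 1/p."""
        mask = rng.random(v.shape) < p
        return np.where(mask, v / p, 0.0)

    def diana_style(grad_fns, x0, lr=0.1, alpha=0.25, steps=200):
        """Workers send q_i = Q(grad_i(x) - h_i); both sides then move h_i toward the
        gradient by alpha * q_i, so the differences (and their variance) shrink."""
        n = len(grad_fns)
        x = x0.copy()
        h = [np.zeros_like(x0) for _ in range(n)]
        h_avg = np.zeros_like(x0)
        for _ in range(steps):
            q = [rand_sparsify(grad_fns[i](x) - h[i]) for i in range(n)]   # compressed messages
            g_hat = h_avg + sum(q) / n          # unbiased estimate of the average gradient
            for i in range(n):
                h[i] = h[i] + alpha * q[i]
            h_avg = h_avg + alpha * (sum(q) / n)
            x = x - lr * g_hat
        return x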

Natural compression for distributed deep learning

S Horváth, CY Ho, Ľ Horváth, AN Sahu… - Mathematical and …, 2022 - proceedings.mlr.press
Modern deep learning models are often trained in parallel over a collection of distributed
machines to reduce training time. In such settings, communication of model updates among …
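
Natural compression rounds each coordinate stochastically to one of the two nearest powers of two, with probabilities chosen so the expectation equals the input; only the sign and exponent then need to be transmitted. A minimal sketch of that rounding rule for a vector input (the compact exponent encoding itself is omitted):

    import numpy as np

    rng = np.random.default_rng(0)

    def natural_compress(v):
        """Randomly round each nonzero entry of v to 2**a or 2**(a+1), where
        2**a <= |v| < 2**(a+1); the rounding probabilities make the output unbiased."""
        v = np.asarray(v, dtype=np.float64)
        sign, mag = np.sign(v), np.abs(v)
        out = np.zeros_like(v)
        nz = mag > 0
        a = np.floor(np.log2(mag[nz]))
        lo, hi = 2.0 ** a, 2.0 ** (a + 1)
        p_up = (mag[nz] - lo) / lo              # P(round up); lo*(1-p_up) + hi*p_up = |v|
        out[nz] = np.where(rng.random(p_up.shape) < p_up, hi, lo)
        return sign * out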

A unified theory of SGD: Variance reduction, sampling, quantization and coordinate descent

E Gorbunov, F Hanzely… - … Conference on Artificial …, 2020 - proceedings.mlr.press
In this paper we introduce a unified analysis of a large family of variants of proximal
stochastic gradient descent (SGD) which so far have required different intuitions …

A better alternative to error feedback for communication-efficient distributed learning

S Horváth, P Richtárik - arXiv preprint arXiv:2006.11077, 2020 - arxiv.org
Modern large-scale machine learning applications require stochastic optimization
algorithms to be implemented on distributed compute systems. A key bottleneck of such …
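
The alternative the abstract points to can be illustrated with an induced compressor: apply a biased compressor first, then compress the leftover error with an unbiased compressor, which makes the combination unbiased overall and removes the need for an error-feedback buffer. The top-k / random-k pairing and the parameters below are illustrative assumptions for concreteness, not necessarily the paper's exact construction.

    import numpy as np

    rng = np.random.default_rng(0)

    def top_k(v, k):
        """Biased compressor: keep the k largest-magnitude entries."""
        out = np.zeros_like(v, dtype=float)
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
        return out

    def rand_k(v, k):
        """Unbiased compressor: keep k random entries, rescaled by d/k."""
        d = v.size
        out = np.zeros_like(v, dtype=float)
        idx = rng.choice(d, size=k, replace=False)
        out[idx] = v[idx] * d / k
        return out

    def induced(v, k1=2, k2=2):
        """top_k handles the bulk of v; rand_k compresses the residual without bias,
        so E[induced(v)] = top_k(v) + (v - top_k(v)) = v."""
        head = top_k(v, k1)
        return head + rand_k(v - head, k2)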

EF-BV: A unified theory of error feedback and variance reduction mechanisms for biased and unbiased compression in distributed optimization

L Condat, K Yi, P Richtárik - Advances in Neural …, 2022 - proceedings.neurips.cc
In distributed or federated optimization and learning, communication between the different
computing units is often the bottleneck and gradient compression is widely used to reduce …