A guide through the zoo of biased SGD

Y Demidovich, G Malinovsky… - Advances in Neural …, 2023 - proceedings.neurips.cc
Abstract: Stochastic Gradient Descent (SGD) is arguably the most important single algorithm
in modern machine learning. Although SGD with unbiased gradient estimators has been …
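For orientation, a minimal sketch of what a biased gradient estimator can look like: Top-K sparsification of the stochastic gradient, one of the canonical biased estimators studied in this line of work. The names `top_k` and `biased_sgd` and the toy quadratic are illustrative, not from the paper.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest.
    This estimator is biased: E[top_k(g)] != E[g] in general."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def biased_sgd(grad_fn, x0, k=3, lr=0.1, steps=200):
    """SGD driven by a biased (Top-K sparsified) gradient estimator."""
    x = x0.copy()
    for _ in range(steps):
        x -= lr * top_k(grad_fn(x), k)
    return x

# Toy problem: f(x) = 0.5 * ||x||^2, with a noisy gradient oracle.
rng = np.random.default_rng(0)
grad = lambda x: x + 0.01 * rng.standard_normal(x.shape)
print(biased_sgd(grad, rng.standard_normal(10)))
```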

On biased compression for distributed learning

A Beznosikov, S Horváth, P Richtárik… - Journal of Machine …, 2023 - jmlr.org
In the last few years, various communication compression techniques have emerged as
indispensable tools helping to alleviate the communication bottleneck in distributed learning …
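A hedged sketch of the standard setup this paper studies: each worker compresses its gradient with a biased compressor (Top-K) before the uplink, and the server averages the sparse messages. The helper names are illustrative.

```python
import numpy as np

def top_k(v, k):
    """Top-K: a biased, contractive compressor."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def distributed_step(x, worker_grads, k=2, lr=0.1):
    """One communication round: workers send Top-K-compressed
    gradients; the server averages them and takes a step."""
    avg = np.mean([top_k(g, k) for g in worker_grads], axis=0)
    return x - lr * avg

# Four workers on a toy problem with model dimension 8.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
x = distributed_step(x, [x + rng.standard_normal(8) for _ in range(4)])
```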

SoteriaFL: A unified framework for private federated learning with communication compression

Z Li, H Zhao, B Li, Y Chi - Advances in Neural Information …, 2022 - proceedings.neurips.cc
To enable large-scale machine learning in bandwidth-hungry environments such as
wireless networks, significant progress has been made recently in designing communication …
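As a rough illustration of combining privacy with compression, a generic clip-noise-compress pipeline; this is not SoteriaFL's actual shifted-compression mechanism, and all names are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def private_compressed_update(g, clip=1.0, sigma=0.5, k=5):
    """Clip the local gradient, add Gaussian noise (for privacy),
    then Top-K compress the noisy message (for bandwidth)."""
    g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # norm clipping
    g = g + sigma * rng.standard_normal(g.shape)          # Gaussian noise
    out = np.zeros_like(g)
    idx = np.argsort(np.abs(g))[-k:]
    out[idx] = g[idx]                                     # Top-K sparsify
    return out

print(private_compressed_update(rng.standard_normal(20)))
```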

Momentum provably improves error feedback!

I Fatkhullin, A Tyurin, P Richtárik - Advances in Neural …, 2023 - proceedings.neurips.cc
Due to the high communication overhead when training machine learning models in a
distributed environment, modern algorithms invariably rely on lossy communication …
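A minimal sketch of the idea in the title: error feedback in the EF21 style, where the receiver tracks a gradient estimate and only compressed corrections are transmitted, driven by a momentum average of the gradient rather than the raw stochastic gradient. This is a simplified single-node rendering, not the paper's exact method.

```python
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_momentum(grad_fn, x0, k=3, lr=0.05, eta=0.1, steps=300):
    """EF21-style error feedback whose compressed corrections chase
    a momentum buffer v instead of the raw stochastic gradient."""
    x = x0.copy()
    v = grad_fn(x)              # momentum buffer
    g = top_k(v, k)             # receiver-side gradient estimate
    for _ in range(steps):
        x -= lr * g
        v = (1 - eta) * v + eta * grad_fn(x)
        g = g + top_k(v - g, k)   # compressed correction message
    return x

rng = np.random.default_rng(0)
grad = lambda x: x + 0.01 * rng.standard_normal(x.shape)
print(ef_momentum(grad, rng.standard_normal(10)))
```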

BEER: Fast rate for decentralized nonconvex optimization with communication compression

H Zhao, B Li, Z Li, P Richtárik… - Advances in Neural …, 2022 - proceedings.neurips.cc
Communication efficiency has been widely recognized as the bottleneck for large-scale
decentralized machine learning applications in multi-agent or federated environments. To …
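For intuition, a sketch of gossip with compressed differences in the CHOCO-SGD style, a simpler relative of BEER (which additionally uses gradient tracking). `W` is a doubly stochastic mixing matrix over the network; all names are illustrative.

```python
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def decentralized_round(X, X_hat, W, grads, k=3, lr=0.05, gamma=0.5):
    """One round of compressed gossip. X holds the n local iterates
    (one row per node), X_hat the publicly known copies, W the
    mixing matrix encoding the communication graph."""
    Q = np.stack([top_k(X[i] - X_hat[i], k) for i in range(len(X))])
    X_hat = X_hat + Q                       # broadcast sparse updates
    X = X - lr * grads + gamma * (W @ X_hat - X_hat)  # gossip correction
    return X, X_hat
```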

EF21-P and friends: Improved theoretical communication complexity for distributed optimization with bidirectional compression

K Gruntkowska, A Tyurin… - … Conference on Machine …, 2023 - proceedings.mlr.press
In this work we focus our attention on distributed optimization problems in the context where
the communication time between the server and the workers is non-negligible. We obtain …
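A naive sketch of bidirectional compression, compressing both the uplink gradients and the downlink model update; EF21-P itself additionally applies error feedback on the primal (server) side, which this sketch omits.

```python
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def bidirectional_round(x, worker_grads, k_up=3, k_down=3, lr=0.1):
    """Compress both directions: workers upload Top-K gradients and
    the server broadcasts a Top-K-compressed model update, so no
    message in either direction is a dense d-dimensional vector."""
    up = np.mean([top_k(g, k_up) for g in worker_grads], axis=0)
    delta = top_k(-lr * up, k_down)   # compressed downlink message
    return x + delta                  # all workers apply the same delta
```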

DoCoFL: Downlink compression for cross-device federated learning

R Dorfman, S Vargaftik… - … on Machine Learning, 2023 - proceedings.mlr.press
Many compression techniques have been proposed to reduce the communication overhead
of Federated Learning training procedures. However, these are typically designed for …

Analysis of error feedback in federated non-convex optimization with biased compression: Fast convergence and partial participation

X Li, P Li - International Conference on Machine Learning, 2023 - proceedings.mlr.press
In practical federated learning (FL) systems, the communication cost between the clients and
the central server can often be a bottleneck. In this paper, we focus on biased gradient …
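A sketch of classic error feedback (EF14-style residual accumulation) under partial participation: only a sampled subset of clients upload each round, and every client carries its compression residual forward. Illustrative only, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_round(x, errors, grad_fns, m=3, k=3, lr=0.05):
    """One round with partial participation: m of the n clients are
    sampled; each adds its stored residual to a fresh gradient,
    uploads the compressed sum, and keeps what compression dropped."""
    active = rng.choice(len(grad_fns), size=m, replace=False)
    msgs = []
    for i in active:
        e = errors[i] + grad_fns[i](x)
        msg = top_k(e, k)
        errors[i] = e - msg           # residual carried to next round
        msgs.append(msg)
    return x - lr * np.mean(msgs, axis=0), errors
```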

Stochastic controlled averaging for federated learning with communication compression

X Huang, P Li, X Li - arXiv preprint arXiv:2308.08165, 2023 - arxiv.org
Communication compression, a technique aiming to reduce the information volume to be
transmitted over the air, has gained great interest in Federated Learning (FL) for the …
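A heavily simplified sketch of the ingredients named in the title: SCAFFOLD-style control variates to correct client drift, with Top-K compression of the uploaded messages. One local step per round, purely for illustration; not the paper's algorithm.

```python
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def controlled_compressed_round(x, c, c_local, grad_fns, k=3, lr=0.1):
    """Control variates correct client drift: client i uploads a
    compressed version of g_i - c_i + c, then refreshes its local
    control c_i; c is the server's average control variate."""
    msgs, new_c = [], []
    for i, grad_fn in enumerate(grad_fns):
        g = grad_fn(x)
        msgs.append(top_k(g - c_local[i] + c, k))  # drift-corrected upload
        new_c.append(g)                            # refreshed local control
    x = x - lr * np.mean(msgs, axis=0)
    return x, np.mean(new_c, axis=0), new_c
```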

Lower bounds and nearly optimal algorithms in distributed learning with communication compression

X Huang, Y Chen, W Yin… - Advances in Neural …, 2022 - proceedings.neurips.cc
Recent advances in distributed optimization and learning have shown that communication
compression is one of the most effective means of reducing communication. While there …