FedVARP: Tackling the variance due to partial client participation in federated learning

D Jhunjhunwala, P Sharma… - Uncertainty in …, 2022 - proceedings.mlr.press
Data-heterogeneous federated learning (FL) systems suffer from two significant sources of
convergence error: 1) client drift error caused by performing multiple local optimization steps …
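
Since this snippet describes the error from partial client participation, below is a minimal sketch, not the paper's FedVARP algorithm, of how sampling only a subset of clients per round makes the server update a noisier estimate of the full-participation average; all names and sizes are illustrative assumptions.

```python
# Minimal sketch (NOT FedVARP itself): the averaged update from a sampled
# subset of clients is an unbiased but higher-variance estimate of the
# full-participation average. All sizes below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_clients, dim, num_sampled = 100, 10, 10

# Hypothetical per-client updates for one round (e.g., results of local SGD).
client_updates = rng.normal(size=(num_clients, dim))

full_avg = client_updates.mean(axis=0)                    # full participation
chosen = rng.choice(num_clients, size=num_sampled, replace=False)
partial_avg = client_updates[chosen].mean(axis=0)         # partial participation

# This gap is the participation variance the paper aims to remove.
print(np.linalg.norm(partial_avg - full_avg))
```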

EF21 with bells & whistles: Practical algorithmic extensions of modern error feedback

I Fatkhullin, I Sokolov, E Gorbunov, Z Li… - arXiv preprint, 2021 - arxiv.org

Variance-reduced clipping for non-convex optimization

A Reisizadeh, H Li, S Das, A Jadbabaie - arXiv preprint, 2023 - arxiv.org
Gradient clipping is a standard training technique used in deep learning applications such
as large-scale language modeling to mitigate exploding gradients. Recent experimental …
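
For readers unfamiliar with the technique this abstract refers to, here is a minimal sketch of standard global-norm gradient clipping; the variance-reduced variant studied in the paper is not shown.

```python
# Minimal sketch of plain global-norm gradient clipping (not the paper's
# variance-reduced variant): rescale the gradient when its norm is too large.
import numpy as np

def clip_gradient(grad, max_norm=1.0):
    """Return grad rescaled so that ||grad|| <= max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])                # norm 5
print(clip_gradient(g, max_norm=1.0))   # rescaled to norm 1
```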

Decentralized stochastic gradient descent ascent for finite-sum minimax problems

H Gao - arXiv preprint arXiv:2212.02724, 2022 - arxiv.org
Minimax optimization problems have attracted significant attention in recent years due to
their widespread application in numerous machine learning models. To solve the minimax …
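
As context for the minimax setting, here is a minimal single-machine sketch of gradient descent ascent (GDA) for min_x max_y f(x, y) on an illustrative toy quadratic; the paper's decentralized, finite-sum method additionally coordinates updates across a network, which is not shown.

```python
# Minimal sketch of gradient descent ascent (GDA) on the illustrative toy
# saddle function f(x, y) = x*y + 0.5*x**2 - 0.5*y**2 (convex in x, concave
# in y). The decentralized algorithms in the paper are far more involved.

def grad_x(x, y):  # df/dx of the toy f
    return y + x

def grad_y(x, y):  # df/dy of the toy f
    return x - y

x, y, lr = 1.0, 1.0, 0.1
for _ in range(100):
    # Simultaneous update: descend in x, ascend in y.
    x, y = x - lr * grad_x(x, y), y + lr * grad_y(x, y)
print(x, y)  # approaches the saddle point (0, 0)
```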

CANITA: Faster rates for distributed convex optimization with communication compression

Z Li, P Richtárik - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Due to the high communication cost in distributed and federated learning, methods relying
on compressed communication are becoming increasingly popular. Besides, the best …
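
As a concrete example of the compressed communication such methods rely on, here is a minimal sketch of top-k sparsification, one standard compression operator; CANITA's acceleration and its specific compressor assumptions are not shown.

```python
# Minimal sketch of top-k sparsification, a common communication compressor:
# only the k largest-magnitude coordinates (values + indices) need to be
# transmitted. Generic background, not CANITA itself.
import numpy as np

def top_k(vec, k):
    """Zero out all but the k largest-magnitude entries of vec."""
    out = np.zeros_like(vec)
    idx = np.argpartition(np.abs(vec), -k)[-k:]
    out[idx] = vec[idx]
    return out

g = np.array([0.1, -2.0, 0.3, 1.5, -0.05])
print(top_k(g, k=2))   # keeps -2.0 and 1.5, zeros the rest
```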

DASHA: Distributed nonconvex optimization with communication compression, optimal oracle complexity, and no client synchronization

A Tyurin, P Richtárik - arXiv preprint arXiv:2202.01268, 2022 - arxiv.org
We develop and analyze DASHA: a new family of methods for nonconvex distributed
optimization problems. When the local functions at the nodes have a finite-sum or an …

FedPAGE: A fast local stochastic gradient method for communication-efficient federated learning

H Zhao, Z Li, P Richtárik - arXiv preprint arXiv:2108.04755, 2021 - arxiv.org
Federated Averaging (FedAvg, also known as Local-SGD) (McMahan et al., 2017) is a
classical federated learning algorithm in which clients run multiple local SGD steps before …
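
Since this snippet describes the FedAvg template itself, here is a minimal sketch of FedAvg / Local-SGD on illustrative quadratic client objectives: each client runs several local SGD steps from the global model, and the server averages the results. FedPAGE's refinements are not shown.

```python
# Minimal sketch of FedAvg / Local-SGD as the abstract describes it (not
# FedPAGE): clients take multiple local SGD steps, then the server averages.
# Client objectives ||w - targets[i]||^2 and all sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
num_clients, dim, local_steps, lr = 5, 3, 10, 0.1
targets = rng.normal(size=(num_clients, dim))

w_global = np.zeros(dim)
for _round in range(20):
    local_models = []
    for i in range(num_clients):
        w = w_global.copy()
        for _ in range(local_steps):            # multiple local SGD steps
            w -= lr * 2.0 * (w - targets[i])    # gradient of the local quadratic
        local_models.append(w)
    w_global = np.mean(local_models, axis=0)    # server-side averaging

print(w_global)   # approaches targets.mean(axis=0) on this toy problem
```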

Simple and optimal stochastic gradient methods for nonsmooth nonconvex optimization

Z Li, J Li - Journal of Machine Learning Research, 2022 - jmlr.org
We propose and analyze several stochastic gradient algorithms for finding stationary points
or local minima in nonconvex, possibly nonsmooth-regularized, finite-sum and online …
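
As background for the nonsmooth-regularizer setting, here is a minimal sketch of one proximal stochastic gradient step with the L1 regularizer, whose proximal map has the closed form known as soft-thresholding; the specific algorithms proposed in the paper are not shown.

```python
# Minimal sketch of one proximal SGD step for F(w) = f(w) + lam * ||w||_1,
# generic background rather than the paper's algorithms.
import numpy as np

def prox_l1(v, thresh):
    """prox_{thresh * ||.||_1}(v): soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

w = np.array([0.5, -1.2, 0.05])
g = np.array([0.1, -0.3, 0.2])       # a stochastic gradient of the smooth part f
lr, lam = 0.5, 0.1
w = prox_l1(w - lr * g, lr * lam)    # gradient step on f, then prox of lam*||.||_1
print(w)                             # [0.4, -1.0, 0.0]
```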

Jointly improving the sample and communication complexities in decentralized stochastic minimax optimization

X Zhang, G Mancino-Ball, NS Aybat, Y Xu - Proceedings of the AAAI …, 2024 - ojs.aaai.org
We propose a novel single-loop decentralized algorithm, DGDA-VR, for solving
stochastic nonconvex strongly-concave minimax problems over a connected network of …

DESTRESS: Computation-optimal and communication-efficient decentralized nonconvex finite-sum optimization

B Li, Z Li, Y Chi - SIAM Journal on Mathematics of Data Science, 2022 - SIAM
Emerging applications in multiagent environments such as internet-of-things, networked
sensing, autonomous systems, and federated learning, call for decentralized algorithms for …