Federated learning of a mixture of global and local models
We propose a new optimization formulation for training federated learning models. The
standard formulation has the form of an empirical risk minimization problem constructed to …
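For concreteness, here is a hedged sketch of the two formulations being contrasted (the notation is illustrative and assumes the mixture formulation penalizes the distance of local models from their average). The standard ERM formulation of federated learning is $\min_{x \in \mathbb{R}^d} \frac{1}{n}\sum_{i=1}^{n} f_i(x)$, where $f_i$ is the loss on the data of client $i$, while a mixture of global and local models replaces the single model $x$ by per-client models $x_1,\dots,x_n$ coupled through a penalty, $\min_{x_1,\dots,x_n \in \mathbb{R}^d} \frac{1}{n}\sum_{i=1}^{n} f_i(x_i) + \frac{\lambda}{2n}\sum_{i=1}^{n} \|x_i - \bar{x}\|^2$ with $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, so that $\lambda = 0$ yields purely local training and $\lambda \to \infty$ recovers the global ERM model.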
Lower bounds and optimal algorithms for personalized federated learning
In this work, we consider the optimization formulation of personalized federated learning
recently introduced by Hanzely & Richtarik (2020) which was shown to give an alternative …
PAGE: A simple and optimal probabilistic gradient estimator for nonconvex optimization
In this paper, we propose a novel stochastic gradient estimator—ProbAbilistic Gradient
Estimator (PAGE)—for nonconvex optimization. PAGE is easy to implement as it is designed …
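As a hedged illustration of the two-branch estimator suggested by the name (the symbols $p$, $\eta$, $b$, and the minibatch $I$ are assumed notation, not quoted from the abstract): after the step $x^{t+1} = x^t - \eta g^t$, the estimator is refreshed with probability $p$ via a full (or large-minibatch) gradient, $g^{t+1} = \nabla f(x^{t+1})$, and otherwise recycled, $g^{t+1} = g^t + \frac{1}{b}\sum_{i \in I}\big(\nabla f_i(x^{t+1}) - \nabla f_i(x^t)\big)$, so most iterations only pay for a small minibatch of gradient differences.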
Acceleration for compressed gradient descent in distributed and federated optimization
Due to the high communication cost in distributed and federated learning problems,
methods relying on compression of communicated messages are becoming increasingly …
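As one concrete example of such a compression operator (an illustration under assumed notation, not quoted from the abstract), random-$k$ sparsification keeps $k$ of the $d$ coordinates chosen uniformly at random and rescales them, $C(x)_j = \frac{d}{k}\, x_j$ for selected $j$ and $C(x)_j = 0$ otherwise; it is unbiased, $\mathbb{E}[C(x)] = x$, and satisfies the bounded-variance condition $\mathbb{E}\|C(x) - x\|^2 \le \omega \|x\|^2$ with $\omega = \frac{d}{k} - 1$, the standard assumption under which compressed gradient methods are analyzed.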
Stochastic gradient descent-ascent: Unified theory and new efficient methods
Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent
algorithms for solving min-max optimization and variational inequality problems (VIPs) …
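For reference, a minimal sketch of the basic SGDA step for $\min_x \max_y f(x,y)$, with generic (assumed) stepsizes $\gamma_1, \gamma_2$ and unbiased stochastic gradients $g_x, g_y$: $x^{t+1} = x^t - \gamma_1\, g_x(x^t, y^t)$ and $y^{t+1} = y^t + \gamma_2\, g_y(x^t, y^t)$, where $\mathbb{E}[g_x] = \nabla_x f$ and $\mathbb{E}[g_y] = \nabla_y f$.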
Variance reduction is an antidote to Byzantines: Better rates, weaker assumptions and communication compression as a cherry on the top
Byzantine-robustness has been gaining a lot of attention due to the growth of interest in
collaborative and federated learning. However, many fruitful directions, such as the usage of …
Stochastic Hamiltonian gradient methods for smooth games
The success of adversarial formulations in machine learning has brought renewed
motivation for smooth games. In this work, we focus on the class of stochastic Hamiltonian …
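For context, a hedged sketch of the Hamiltonian viewpoint (notation assumed, not quoted from the abstract): writing the game's vector field as $\xi(w) = \big(\nabla_x f(x,y),\, -\nabla_y f(x,y)\big)$ for $w = (x,y)$, Hamiltonian gradient methods perform descent on $\mathcal{H}(w) = \frac{1}{2}\|\xi(w)\|^2$, whose global minimizers (the points with $\xi(w) = 0$) are exactly the stationary points of the game; stochastic variants replace $\nabla \mathcal{H}$ with cheaper unbiased estimates.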
Error compensated distributed SGD can be accelerated
Gradient compression is a recent and increasingly popular technique for reducing the
communication cost in distributed training of large-scale machine learning models. In this …
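As a hedged sketch of the error-compensation mechanism itself (a single-worker view with assumed notation, not quoted from the abstract): with a compressor $C$, stepsize $\gamma$, and stochastic gradient $g^t$, the worker keeps an error vector $e^t$, transmits $\Delta^t = C(e^t + \gamma g^t)$, the model is updated as $x^{t+1} = x^t - \Delta^t$, and the residual is stored as $e^{t+1} = e^t + \gamma g^t - \Delta^t$, so information discarded by the compressor is re-injected in later rounds instead of being lost.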
Lower complexity bounds of finite-sum optimization problems: The results and construction
In this paper we study the lower complexity bounds for finite-sum optimization problems,
where the objective is the average of $ n $ individual component functions. We consider a …
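Concretely, these problems take the form $\min_{x \in \mathbb{R}^d} f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$, and a lower complexity bound specifies the minimum number of queries to the component gradients $\nabla f_i$ that any algorithm in a given oracle class must make to reach a prescribed accuracy.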
An optimal algorithm for decentralized finite-sum optimization
Modern large-scale finite-sum optimization relies on two key aspects: distribution and
stochastic updates. For smooth and strongly convex problems, existing decentralized …