Stochastic distributed learning with gradient quantization and double-variance reduction

S Horváth, D Kovalev, K Mishchenko… - Optimization Methods …, 2023 - Taylor & Francis
We consider distributed optimization over several devices, each sending incremental model
updates to a central server. This setting is considered, for instance, in federated learning …
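
To make the communication pattern concrete, here is a minimal sketch (not the paper's algorithm) of an unbiased random-dithering quantizer together with a DIANA-style device update that compresses the difference between the local gradient and a locally stored shift; the function names, the number of quantization levels, and the shift update are illustrative assumptions.

    import numpy as np

    def random_dithering(v, levels=4, rng=np.random.default_rng(0)):
        # Unbiased random-dithering quantizer: transmit ||v||, signs, and integer levels.
        norm = np.linalg.norm(v)
        if norm == 0.0:
            return np.zeros_like(v)
        scaled = np.abs(v) / norm * levels
        lower = np.floor(scaled)
        # Round up with probability equal to the fractional part, keeping the estimate unbiased.
        q = lower + (rng.random(v.shape) < (scaled - lower))
        return norm * np.sign(v) * q / levels

    def device_message(local_grad, shift):
        # DIANA-style step (illustrative): compress the difference to the stored shift,
        # so that what is sent shrinks as the local gradients stabilize.
        delta = random_dithering(local_grad - shift)
        new_shift = shift + delta  # the server can maintain the same shift from the transmitted delta
        return delta, new_shift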

Communication-efficient distributed optimization in networks with gradient tracking and variance reduction

B Li, S Cen, Y Chen, Y Chi - Journal of Machine Learning Research, 2020 - jmlr.org
There is growing interest in large-scale machine learning and optimization over
decentralized networks, e.g. in the context of multi-agent learning and federated learning …
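
For context on the gradient-tracking part of the title, below is a minimal sketch of a standard DIGing-style gradient-tracking round over a doubly stochastic mixing matrix W; it is a generic textbook update, not necessarily the variance-reduced method proposed in the paper.

    import numpy as np

    def gradient_tracking_step(X, Y, G_prev, grads, W, step):
        # One gradient-tracking round. Shapes:
        #   X[i]      : local iterate of agent i            (n_agents x dim)
        #   Y[i]      : agent i's tracker of the average gradient
        #   G_prev[i] : local gradient at the previous iterate
        #   grads(X)  : stacked local gradients at the current iterates
        X_new = W @ X - step * Y            # mix iterates, move along the tracked direction
        G_new = grads(X_new)
        Y_new = W @ Y + G_new - G_prev      # track the network-average gradient
        return X_new, Y_new, G_new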

A linearly convergent algorithm for decentralized optimization: Sending less bits for free!

D Kovalev, A Koloskova, M Jaggi… - International …, 2021 - proceedings.mlr.press
Decentralized optimization methods enable on-device training of machine learning models
without a central coordinator. In many scenarios communication between devices is energy …

Stochastic distributed optimization under average second-order similarity: Algorithms and analysis

D Lin, Y Han, H Ye, Z Zhang - Advances in Neural …, 2023 - proceedings.neurips.cc
We study finite-sum distributed optimization problems involving a master node and $n-1$
local nodes under the popular $\delta$-similarity and $\mu$-strong convexity conditions …
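
A commonly used statement of these two conditions, written out here as an assumption rather than a quotation from the paper, with $f(x)=\frac{1}{n}\sum_{i=1}^{n} f_i(x)$:

    % Average second-order (delta-)similarity of the local functions to their mean,
    % and mu-strong convexity of the mean.
    \frac{1}{n}\sum_{i=1}^{n}\bigl\|\nabla^2 f_i(x)-\nabla^2 f(x)\bigr\|^2 \le \delta^2,
    \qquad
    \nabla^2 f(x) \succeq \mu I \quad \text{for all } x.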

Lower complexity bounds of finite-sum optimization problems: The results and construction

Y Han, G **e, Z Zhang - Journal of Machine Learning Research, 2024 - jmlr.org
In this paper we study the lower complexity bounds for finite-sum optimization problems,
where the objective is the average of $n$ individual component functions. We consider a …

Accelerating value iteration with anchoring

J Lee, E Ryu - Advances in Neural Information Processing …, 2023 - proceedings.neurips.cc
Value Iteration (VI) is foundational to the theory and practice of modern reinforcement
learning, and it is known to converge at a $\mathcal{O}(\gamma^k)$-rate. Surprisingly …
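
As a point of reference for the anchoring idea, here is a minimal Halpern-style anchored value-iteration sketch; the weight schedule beta = 1/(k+1) is a placeholder assumption, not the schedule analysed in the paper.

    import numpy as np

    def anchored_value_iteration(T, v0, iters=100):
        # Value iteration with an anchor: each step is pulled back toward the start point v0.
        #   T  : the Bellman operator, a gamma-contraction on value functions
        #   v0 : initial value estimate, also used as the anchor
        v = v0.copy()
        for k in range(1, iters + 1):
            beta = 1.0 / (k + 1)                    # placeholder Halpern-style weight
            v = beta * v0 + (1.0 - beta) * T(v)     # anchored update (beta = 0 recovers plain VI)
        return v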

Towards characterizing the first-order query complexity of learning (approximate) nash equilibria in zero-sum matrix games

H Hadiji, S Sachs, T van Erven… - Advances in Neural …, 2023 - proceedings.neurips.cc
In the first-order query model for zero-sum $K \times K$ matrix games, players observe the
expected pay-offs for all their possible actions under the randomized action played by their …

Variance reduction via primal-dual accelerated dual averaging for nonsmooth convex finite-sums

C Song, SJ Wright… - … Conference on Machine …, 2021 - proceedings.mlr.press
Structured nonsmooth convex finite-sum optimization appears in many machine learning
applications, including support vector machines and least absolute deviation. For the primal …
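
A tiny example of such a structured nonsmooth finite-sum, purely for illustration (an L2-regularized hinge-loss SVM objective; all names are illustrative):

    import numpy as np

    def svm_objective(w, A, b, lam):
        # Nonsmooth convex finite-sum: average hinge loss plus a smooth quadratic regularizer.
        #   A : n x d feature matrix, b : labels in {-1, +1}
        margins = 1.0 - b * (A @ w)
        hinge = np.maximum(0.0, margins).mean()   # nonsmooth finite-sum part
        return hinge + 0.5 * lam * np.dot(w, w)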

Variance reduction via accelerated dual averaging for finite-sum optimization

C Song, Y Jiang, Y Ma - Advances in Neural Information …, 2020 - proceedings.neurips.cc
In this paper, we introduce a simplified and unified method for finite-sum convex
optimization, named Variance Reduction via Accelerated Dual Averaging (VRADA) …

Acceleration of SVRG and Katyusha X by inexact preconditioning

Y Liu, F Feng, W Yin - International Conference on Machine …, 2019 - proceedings.mlr.press
Empirical risk minimization is an important class of optimization problems with many popular
machine learning applications, and stochastic variance reduction methods are popular …
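
For readers unfamiliar with the baseline being accelerated, a minimal textbook SVRG loop is sketched below; the inexact preconditioning studied in the paper is deliberately omitted, and the helper grad_i is a hypothetical per-component gradient oracle.

    import numpy as np

    def svrg(grad_i, w0, n, step, epochs=10, inner=None, rng=np.random.default_rng(0)):
        # Basic SVRG: one full-gradient snapshot per epoch, variance-reduced stochastic steps inside.
        #   grad_i(w, i) : gradient of the i-th component function at w
        inner = inner or n
        w = w0.copy()
        for _ in range(epochs):
            snapshot = w.copy()
            full_grad = sum(grad_i(snapshot, i) for i in range(n)) / n   # snapshot gradient
            for _ in range(inner):
                i = rng.integers(n)
                g = grad_i(w, i) - grad_i(snapshot, i) + full_grad       # variance-reduced estimate
                w = w - step * g
        return w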