Stochastic distributed learning with gradient quantization and double-variance reduction
We consider distributed optimization over several devices, each sending incremental model
updates to a central server. This setting is considered, for instance, in federated learning …
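The snippet breaks off, but the title pairs gradient quantization with variance reduction. As a rough illustration of the compression side alone, below is a minimal unbiased stochastic quantizer in the spirit of QSGD that a device might apply to its gradient before sending it to the server; the function name, the number of levels, and the example vector are illustrative and not taken from the paper.

```python
import numpy as np

def quantize(g, levels=4, rng=None):
    """Unbiased stochastic quantization of a gradient vector (QSGD-style sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return np.zeros_like(g)
    scaled = np.abs(g) / norm * levels        # magnitudes mapped into [0, levels]
    lower = np.floor(scaled)
    prob_up = scaled - lower                  # round up with this probability, so E[Q(g)] = g
    rounded = lower + (rng.random(g.shape) < prob_up)
    return np.sign(g) * norm * rounded / levels

# A device would transmit only the norm, the signs, and small integer levels.
g = np.array([0.3, -1.2, 0.05, 2.0])
rng = np.random.default_rng(0)
print(quantize(g, rng=rng))
# Averaging many independent quantizations recovers g (unbiasedness):
print(np.mean([quantize(g, rng=rng) for _ in range(5000)], axis=0))
```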
Communication-efficient distributed optimization in networks with gradient tracking and variance reduction
There is growing interest in large-scale machine learning and optimization over
decentralized networks, e.g., in the context of multi-agent learning and federated learning …
A linearly convergent algorithm for decentralized optimization: Sending less bits for free!
Decentralized optimization methods enable on-device training of machine learning models
without a central coordinator. In many scenarios communication between devices is energy …
Stochastic distributed optimization under average second-order similarity: Algorithms and analysis
We study finite-sum distributed optimization problems involving a master node and $n-1$
local nodes under the popular $\delta$-similarity and $\mu$-strong convexity conditions …
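The truncated abstract names the assumptions but not their form. One common way the $\mu$-strong convexity and average second-order $\delta$-similarity conditions are written in this literature is sketched below; the averaged Hessian-similarity form is an assumption on my part and may not match the paper's exact definition.

```latex
% One common way these assumptions are written (the paper's exact definitions may differ),
% with f(x) = (1/n) * sum_i f_i(x):
\[
  f(y) \;\ge\; f(x) + \langle \nabla f(x),\, y - x \rangle + \frac{\mu}{2}\,\|y - x\|^{2}
  \qquad (\mu\text{-strong convexity}),
\]
\[
  \frac{1}{n}\sum_{i=1}^{n} \bigl\| \nabla^{2} f_{i}(x) - \nabla^{2} f(x) \bigr\|^{2} \;\le\; \delta^{2}
  \quad \text{for all } x
  \qquad (\text{average second-order } \delta\text{-similarity}).
\]
```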
Lower complexity bounds of finite-sum optimization problems: The results and construction
In this paper we study the lower complexity bounds for finite-sum optimization problems,
where the objective is the average of $n$ individual component functions. We consider a …
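For concreteness, the finite-sum objective described in the snippet is written out below; the oracle-counting remark in the comment is my paraphrase of how such lower bounds are usually stated, not a claim from the paper.

```latex
% The finite-sum objective described in the snippet:
\[
  \min_{x \in \mathbb{R}^{d}} \; f(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} f_{i}(x).
\]
% Lower bounds for this class are typically stated as the number of individual
% component gradients \nabla f_i that any algorithm must query to reach a target accuracy.
```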
Accelerating value iteration with anchoring
Value Iteration (VI) is foundational to the theory and practice of modern reinforcement
learning, and it is known to converge at a $\mathcal{O}(\gamma^k)$-rate. Surprisingly …
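Since the snippet cites the classical $\mathcal{O}(\gamma^k)$ rate, here is a minimal sketch of plain value iteration on a toy MDP that numerically exhibits that contraction; the transition probabilities and rewards are made up, and the paper's anchoring modification is not reproduced here.

```python
import numpy as np

# Toy 2-state, 2-action MDP (illustrative numbers only).
gamma = 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],   # P[s, a, s'] transition probabilities
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],                 # R[s, a] immediate rewards
              [0.5, 2.0]])

def bellman(V):
    """Bellman optimality operator: (TV)(s) = max_a [ R[s,a] + gamma * sum_s' P[s,a,s'] V(s') ]."""
    return np.max(R + gamma * (P @ V), axis=1)

# Approximate the fixed point V* first, then watch ||V_k - V*|| shrink like gamma^k.
V_star = np.zeros(2)
for _ in range(5000):
    V_star = bellman(V_star)

V = np.zeros(2)
err0 = np.max(np.abs(V - V_star))
for k in range(1, 11):
    V = bellman(V)
    err = np.max(np.abs(V - V_star))
    print(f"k={k:2d}  ||V_k - V*||_inf = {err:.6f}   gamma^k * ||V_0 - V*||_inf = {gamma**k * err0:.6f}")
```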
Towards characterizing the first-order query complexity of learning (approximate) Nash equilibria in zero-sum matrix games
In the first-order query model for zero-sum $K \times K$ matrix games, players observe the
expected pay-offs for all their possible actions under the randomized action played by their …
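To make the query model concrete, the sketch below shows what a single first-order query returns in a $K \times K$ matrix game: each player sees the expected payoff of every pure action against the opponent's current mixed strategy. The payoff matrix and strategies are illustrative, and the duality-gap computation at the end is a standard consequence of this feedback, not something claimed by the paper.

```python
import numpy as np

# Illustrative K x K zero-sum game; A[a, b] is the payoff to the row player.
rng = np.random.default_rng(0)
K = 4
A = rng.uniform(-1.0, 1.0, size=(K, K))

x = np.full(K, 1.0 / K)   # row player's mixed strategy
y = np.full(K, 1.0 / K)   # column player's mixed strategy

# One first-order query: the expected payoff of every pure action
# against the opponent's current mixed strategy.
row_feedback = A @ y      # entry a: expected payoff if the row player plays action a against y
col_feedback = A.T @ x    # entry b: expected (row) payoff if the column player plays action b against x

print(row_feedback)
print(col_feedback)

# The duality gap (exploitability) of (x, y) is computable from exactly this feedback.
gap = row_feedback.max() - col_feedback.min()
print("duality gap:", gap)
```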
Variance reduction via primal-dual accelerated dual averaging for nonsmooth convex finite-sums
Structured nonsmooth convex finite-sum optimization appears in many machine learning
applications, including support vector machines and least absolute deviation. For the primal …
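The two applications named in the snippet correspond to standard nonsmooth finite-sum templates, written out below for reference; the symbols $a_i$, $b_i$, $\lambda$ and the exact problem template used in the paper are assumptions here.

```latex
% Standard instances of the structured nonsmooth finite-sum named in the snippet;
% the symbols a_i, b_i, \lambda and the exact template used in the paper are assumptions.
\[
  \min_{x \in \mathbb{R}^{d}} \; \frac{1}{n}\sum_{i=1}^{n} \max\bigl(0,\; 1 - b_{i}\, a_{i}^{\top} x \bigr) + \frac{\lambda}{2}\|x\|^{2}
  \qquad \text{(support vector machine, hinge loss)},
\]
\[
  \min_{x \in \mathbb{R}^{d}} \; \frac{1}{n}\sum_{i=1}^{n} \bigl| a_{i}^{\top} x - b_{i} \bigr|
  \qquad \text{(least absolute deviation)}.
\]
```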
Variance reduction via accelerated dual averaging for finite-sum optimization
In this paper, we introduce a simplified and unified method for finite-sum convex
optimization, named \emph{Variance Reduction via Accelerated Dual Averaging (VRADA)} …
Acceleration of SVRG and Katyusha X by inexact preconditioning
Empirical risk minimization is an important class of optimization problems with many popular
machine learning applications, and stochastic variance reduction methods are widely used …
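The snippet leans on stochastic variance reduction; as background, here is a minimal SVRG-style loop on a synthetic ridge-regression finite-sum. It illustrates only the generic variance-reduction mechanism; the problem data, step size, and epoch length are made up, and the paper's inexact-preconditioning acceleration of SVRG and Katyusha X is not shown.

```python
import numpy as np

# Synthetic ridge-regression finite-sum: f(w) = (1/n) sum_i f_i(w),
# with f_i(w) = 0.5*(x_i^T w - y_i)^2 + 0.5*lam*||w||^2.
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
lam = 0.1

def grad_i(w, i):
    return (X[i] @ w - y[i]) * X[i] + lam * w

def full_grad(w):
    return X.T @ (X @ w - y) / n + lam * w

w = np.zeros(d)
eta, epochs, m = 0.01, 20, n
for _ in range(epochs):
    w_ref = w.copy()
    mu = full_grad(w_ref)                  # full gradient at the reference (snapshot) point
    for _ in range(m):
        i = rng.integers(n)
        # Variance-reduced stochastic gradient: unbiased, with variance that
        # vanishes as both w and w_ref approach the minimizer.
        g = grad_i(w, i) - grad_i(w_ref, i) + mu
        w -= eta * g
    print(np.linalg.norm(full_grad(w)))
```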