Recent advances in stochastic gradient descent in deep learning
In the age of artificial intelligence, finding the best approach to handling huge amounts of data is a tremendously motivating and hard problem. Among machine learning models, stochastic …
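For orientation, the methods this line of surveys covers all build on the plain SGD update; a minimal sketch follows (the toy objective, step size, and noise model below are our own illustration, not taken from the paper):

import numpy as np

def sgd_step(w, grad_fn, lr):
    # one stochastic step: w <- w - lr * g(w), where g is an
    # unbiased estimate of the gradient (e.g. from a minibatch)
    return w - lr * grad_fn(w)

rng = np.random.default_rng(0)
w = np.ones(3)
for _ in range(200):
    # noisy gradient of f(w) = ||w||^2 stands in for a minibatch gradient
    w = sgd_step(w, lambda v: 2.0 * v + 0.1 * rng.normal(size=v.shape), lr=0.05)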
Distributed machine learning for wireless communication networks: Techniques, architectures, and applications
Distributed machine learning (DML) techniques, such as federated learning, partitioned
learning, and distributed reinforcement learning, have been increasingly applied to wireless …
Federated learning: A survey on enabling technologies, protocols, and applications
This paper provides a comprehensive study of Federated Learning (FL) with an emphasis
on enabling software and hardware platforms, protocols, real-life applications and use …
Federated optimization: Distributed machine learning for on-device intelligence
We introduce a new and increasingly relevant setting for distributed optimization in machine
learning, where the data defining the optimization are unevenly distributed over an …
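The federated setting described here is typically attacked with local-update methods. A minimal sketch of one round of local training plus server averaging, in the spirit of this line of work rather than this paper's exact algorithm (the client data, loss, and step counts are synthetic stand-ins):

import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, xs, ys, lr=0.1, steps=5):
    # a few local least-squares SGD steps on one client's own shard
    w = w.copy()
    for _ in range(steps):
        j = rng.integers(len(xs))
        w -= lr * (w @ xs[j] - ys[j]) * xs[j]
    return w

def federated_round(w, clients):
    # each client trains locally on data that never leaves the device;
    # the server averages the returned models
    return np.mean([local_sgd(w, xs, ys) for xs, ys in clients], axis=0)

clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
w = np.zeros(3)
for _ in range(10):
    w = federated_round(w, clients)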
SARAH: A novel method for machine learning problems using stochastic recursive gradient
In this paper, we propose a StochAstic Recursive grAdient algoritHm (SARAH), as well as its
practical variant SARAH+, as a novel approach to the finite-sum minimization problems …
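SARAH's defining feature is its recursive gradient estimate: an outer iteration computes one full gradient, and inner iterations update the estimate through the previous iterate rather than a fixed snapshot. A sketch of one outer loop (the least-squares toy problem around it is our own):

import numpy as np

def sarah_epoch(w, grad_i, full_grad, n, lr=0.05, inner=50):
    rng = np.random.default_rng(0)
    w_prev = w.copy()
    v = full_grad(w_prev)                 # v_0 is a full gradient
    w = w_prev - lr * v
    for _ in range(inner):
        i = rng.integers(n)
        # SARAH recursion: v_t = g_i(w_t) - g_i(w_{t-1}) + v_{t-1}
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, w - lr * v
    return w

# toy usage: least squares with f_i(w) = 0.5 * (a_i . w - b_i)^2
rng = np.random.default_rng(1)
A, b = rng.normal(size=(100, 3)), rng.normal(size=100)
g_i = lambda w, i: (A[i] @ w - b[i]) * A[i]
g = lambda w: A.T @ (A @ w - b) / len(b)
w = sarah_epoch(np.zeros(3), g_i, g, n=100)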
D²: Decentralized Training over Decentralized Data
While training a machine learning model using multiple workers, each of which collects data
from its own data source, it would be useful when the data collected from different workers …
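For context, a sketch of the plain decentralized SGD step this line of work starts from: each worker gossip-averages parameters with its graph neighbors via a doubly stochastic mixing matrix, then takes a local stochastic gradient step. As we recall it, D² adds a correction built from the previous iterate and gradient so that convergence stops depending on how different the workers' data distributions are; that correction is not shown here, and the ring topology and data are toy stand-ins.

import numpy as np

def decentralized_sgd_step(params, grads, W, lr=0.05):
    # params: (n_workers, dim); W: doubly stochastic mixing matrix.
    # Row i of W @ params is worker i's average over its neighbors.
    return W @ params - lr * grads

# toy usage on a ring of 4 workers
W = np.array([[.5, .25, 0., .25],
              [.25, .5, .25, 0.],
              [0., .25, .5, .25],
              [.25, 0., .25, .5]])
params = np.zeros((4, 3))
grads = np.ones((4, 3))        # stand-in for per-worker stochastic gradients
params = decentralized_sgd_step(params, grads, W)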
Katyusha: The first direct acceleration of stochastic gradient methods
Z. Allen-Zhu, Journal of Machine Learning Research, 2018, jmlr.org
Nesterov's momentum trick is famously known for accelerating gradient descent, and has
been proven useful in building fast iterative algorithms. However, in the stochastic setting …
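The difficulty in the stochastic setting is what Katyusha's "negative momentum" addresses. As the method is usually stated (notation ours), the iterates are coupled with an SVRG-style snapshot \tilde{x}:

x_{k+1} = \tau_1 z_k + \tau_2 \tilde{x} + (1 - \tau_1 - \tau_2) y_k,

where z_k is the mirror-descent sequence and y_k the gradient-descent sequence; the \tau_2 \tilde{x} term pulls each iterate back toward the snapshot, acting as a magnet that keeps accumulated stochastic error from derailing the Nesterov-style acceleration.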
Stochastic variance reduction for nonconvex optimization
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient
(SVRG) methods for them. SVRG and related methods have recently surged into …
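A sketch of the SVRG estimator the paper analyzes, in its standard form (the loop scaffolding is ours). Contrast with the SARAH sketch above: SVRG anchors its correction at a fixed snapshot instead of recursing through the previous iterate.

import numpy as np

def svrg_epoch(w, grad_i, full_grad, n, lr=0.05, inner=50):
    rng = np.random.default_rng(0)
    snap = w.copy()
    mu = full_grad(snap)              # full gradient at the snapshot
    for _ in range(inner):
        i = rng.integers(n)
        # variance-reduced estimate: g_i(w) - g_i(snap) + mu,
        # unbiased and with vanishing variance as w approaches snap
        w = w - lr * (grad_i(w, i) - grad_i(snap, i) + mu)
    return w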
Variance-reduced methods for machine learning
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
In this paper we study several classes of stochastic optimization algorithms enriched with
heavy ball momentum. Among the methods studied are: stochastic gradient descent …
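The heavy ball enrichment studied here is a one-line change to the SGD update; a minimal sketch (the toy objective, momentum constant, and step size are our own choices):

import numpy as np

def sgd_heavy_ball(w, grad_fn, lr=0.05, beta=0.9, steps=200):
    w_prev = w.copy()
    for _ in range(steps):
        # heavy ball: w_{t+1} = w_t - lr * g(w_t) + beta * (w_t - w_{t-1})
        w_next = w - lr * grad_fn(w) + beta * (w - w_prev)
        w_prev, w = w, w_next
    return w

# toy usage on f(w) = ||w||^2 with a noisy gradient oracle
rng = np.random.default_rng(0)
w = sgd_heavy_ball(np.ones(3), lambda v: 2.0 * v + 0.1 * rng.normal(size=3))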