Convergence of Adam under relaxed assumptions

H Li, A Rakhlin, A Jadbabaie - Advances in Neural …, 2023 - proceedings.neurips.cc
In this paper, we provide a rigorous proof of convergence of the Adaptive Moment Estimation
(Adam) algorithm for a wide class of optimization objectives. Despite the popularity and …
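For reference, the standard Adam recursion whose convergence the paper studies can be sketched as below; this is the textbook update (moving averages of the gradient and its square, bias correction, coordinate-wise rescaled step), not the paper's proof machinery, and the default hyperparameters are only illustrative.

```python
import numpy as np

def adam_step(x, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update at iteration t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2        # second-moment estimate
    m_hat = m / (1 - beta1**t)                   # bias correction
    v_hat = v / (1 - beta2**t)
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)  # coordinate-wise rescaled step
    return x, m, v
```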

Faster non-convex federated learning via global and local momentum

R Das, A Acharya, A Hashemi… - Uncertainty in …, 2022 - proceedings.mlr.press
We propose FedGLOMO, a novel federated learning (FL) algorithm with an iteration
complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$ …
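The abstract is truncated, so the sketch below is only a generic illustration of combining client-side (local) and server-side (global) momentum in a federated round; the function names, momentum parameters, and update order are hypothetical and do not reproduce the exact FedGLOMO recursion.

```python
import numpy as np

def federated_round(x, clients, server_mom, beta_g=0.9, beta_l=0.9,
                    local_steps=5, lr_local=0.01, lr_global=1.0):
    """One hypothetical round: each client runs momentum SGD locally,
    the server averages the resulting updates and applies its own momentum."""
    deltas = []
    for grad_fn in clients:                  # grad_fn(y) returns a stochastic gradient
        y, m = x.copy(), np.zeros_like(x)
        for _ in range(local_steps):
            m = beta_l * m + (1 - beta_l) * grad_fn(y)   # local momentum
            y = y - lr_local * m
        deltas.append(y - x)
    avg_delta = np.mean(deltas, axis=0)
    server_mom = beta_g * server_mom + (1 - beta_g) * avg_delta  # global momentum
    return x + lr_global * server_mom, server_mom
```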

Communication compression for Byzantine robust learning: New efficient algorithms and improved rates

A Rammal, K Gruntkowska, N Fedin… - International …, 2024 - proceedings.mlr.press
Byzantine robustness is an essential feature of algorithms for certain distributed optimization
problems, typically encountered in collaborative/federated learning. These problems are …
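As a hedged illustration of the general setting only (not of the algorithms proposed in this paper), one can pair a simple top-$k$ sparsifier with a coordinate-wise median aggregator, which tolerates a minority of arbitrarily corrupted worker messages while reducing communication.

```python
import numpy as np

def top_k(g, k):
    """Keep the k largest-magnitude coordinates of g, zero out the rest."""
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

def robust_aggregate(worker_grads, k):
    """Compress each worker's gradient, then take a coordinate-wise median,
    which is resilient to a minority of Byzantine (arbitrary) messages."""
    compressed = [top_k(g, k) for g in worker_grads]
    return np.median(np.stack(compressed), axis=0)
```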

DASHA: Distributed nonconvex optimization with communication compression, optimal oracle complexity, and no client synchronization

A Tyurin, P Richtárik - arXiv preprint arXiv:2202.01268, 2022 - arxiv.org
We develop and analyze DASHA: a new family of methods for nonconvex distributed
optimization problems. When the local functions at the nodes have a finite-sum or an …
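Methods in this family are built around compressed gradient differences; the sketch below shows only the generic pattern (each node compresses the change in its local gradient, and the server accumulates the messages into a gradient estimator), using a Rand-$k$ compressor as a stand-in. It is not the exact DASHA estimator with its momentum-style correction.

```python
import numpy as np

def rand_k(g, k, rng):
    """Unbiased Rand-k compressor: keep k random coordinates, rescale by d/k."""
    d = g.size
    out = np.zeros_like(g)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = g[idx] * (d / k)
    return out

def node_message(grad_new, grad_old, k, rng):
    """Each node transmits a compressed gradient *difference*, not a full gradient."""
    return rand_k(grad_new - grad_old, k, rng)

# Server side (schematic): g_est += mean of node messages; x -= step * g_est
```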

Trust Region Methods for Nonconvex Stochastic Optimization beyond Lipschitz Smoothness

C Xie, C Li, C Zhang, Q Deng, D Ge, Y Ye - Proceedings of the AAAI …, 2024 - ojs.aaai.org
In many important machine learning applications, the standard assumption of having a
globally Lipschitz continuous gradient may fail to hold. This paper delves into a more …
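For context, the classical trust region iteration (stated here independently of the paper's relaxed-smoothness analysis) minimizes a quadratic model within a radius $\Delta_k$ and adjusts the radius using the ratio of actual to predicted decrease:

```latex
% Trust region subproblem at iterate x_k with gradient g_k and Hessian model B_k
\min_{\|s\| \le \Delta_k} \; m_k(s) = f(x_k) + g_k^\top s + \tfrac{1}{2}\, s^\top B_k s,
\qquad
\rho_k = \frac{f(x_k) - f(x_k + s_k)}{m_k(0) - m_k(s_k)} .
```

If $\rho_k$ is large, the step $s_k$ is accepted and $\Delta_k$ may grow; otherwise the step is rejected and $\Delta_k$ shrinks.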

A stochastic proximal gradient framework for decentralized non-convex composite optimization: Topology-independent sample complexity and communication …

R Xin, S Das, UA Khan, S Kar - arXiv preprint arXiv:2110.01594, 2021 - arxiv.org
Decentralized optimization is a promising parallel computation paradigm for large-scale
data analytics and machine learning problems defined over a network of nodes. This paper …
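Composite objectives of the form $f(x) + r(x)$ in such frameworks are typically handled by a stochastic proximal gradient step; the generic centralized form, with step size $\alpha$ and stochastic gradient $g^k \approx \nabla f(x^k)$, is shown below. The decentralized algorithm additionally mixes iterates over the network, which is omitted here.

```latex
x^{k+1} = \operatorname{prox}_{\alpha r}\!\left(x^k - \alpha\, g^k\right),
\qquad
\operatorname{prox}_{\alpha r}(y) = \arg\min_{x} \Big\{ r(x) + \tfrac{1}{2\alpha}\,\|x - y\|^2 \Big\} .
```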

Breaking the lower bound with (little) structure: Acceleration in non-convex stochastic optimization with heavy-tailed noise

Z Liu, J Zhang, Z Zhou - The Thirty Sixth Annual Conference …, 2023 - proceedings.mlr.press
In this paper, we consider the stochastic optimization problem with smooth but not
necessarily convex objectives in the heavy-tailed noise regime, where the stochastic …
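A standard ingredient in the heavy-tailed regime (included here only as background, not as the paper's accelerated method) is gradient clipping, which keeps each update bounded even when the stochastic gradient $g_k$ has unbounded variance:

```latex
x_{k+1} = x_k - \eta_k\, \mathrm{clip}_{\tau_k}(g_k),
\qquad
\mathrm{clip}_{\tau}(g) = \min\!\left(1, \frac{\tau}{\|g\|}\right) g .
```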

Variance reduced distributed non-convex optimization using matrix stepsizes

H Li, A Karagulyan, P Richtárik - 2024 - repository.kaust.edu.sa
Matrix-stepsized gradient descent algorithms have been shown to have superior
performance in non-convex optimization problems compared to their scalar counterparts …
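In its generic form (the variance-reduced, distributed version is the paper's contribution), a matrix-stepsized gradient step replaces the scalar learning rate with a positive definite matrix $\mathbf{D}$, which can act as a preconditioner:

```latex
x^{k+1} = x^k - \mathbf{D}\, \nabla f(x^k),
\qquad \mathbf{D} \succ 0 \quad \text{(e.g., diagonal or a fixed preconditioner)} .
```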

Random-reshuffled SARAH does not need full gradient computations

A Beznosikov, M Takáč - Optimization Letters, 2024 - Springer
The StochAstic Recursive grAdient algoritHm (SARAH) is a variance-reduced
variant of the Stochastic Gradient Descent algorithm that needs a gradient of the …
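For reference, the plain SARAH estimator is the recursion sketched below; the full gradient computed at the start of each outer loop is exactly the step that the random-reshuffling variant studied here avoids. The sketch assumes access to a full-gradient oracle `grad_full(w)` and per-sample gradients `grad_i(w, i)`, which are illustrative names.

```python
import numpy as np

def sarah_epoch(w, grad_full, grad_i, n, lr, rng):
    """One SARAH outer loop: full gradient once, then recursive
    single-sample updates v_t = g_i(w_t) - g_i(w_{t-1}) + v_{t-1}."""
    v = grad_full(w)              # the full-gradient step the paper removes
    w_prev = w.copy()
    w = w - lr * v
    for _ in range(n - 1):
        i = rng.integers(n)       # sampled index (reshuffling variants permute instead)
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev = w.copy()
        w = w - lr * v
    return w
```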

A Unified Model for Large-Scale Inexact Fixed-Point Iteration: A Stochastic Optimization Perspective

A Hashemi - IEEE Transactions on Automatic Control, 2024 - ieeexplore.ieee.org
Calculating fixed points of a nonlinear function is a central problem in many areas of science
and engineering with applications ranging from the study of dynamical systems to …
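The basic object of study is the fixed-point iteration $x^{k+1} = T(x^k)$ for an operator $T$ with $T(x^\star) = x^\star$. A generic template for the inexact/stochastic setting such unified models cover (not the paper's specific model) is the relaxed Krasnosel'skii–Mann iteration with relaxation $\alpha_k$ and evaluation error $e^k$:

```latex
x^{k+1} = (1 - \alpha_k)\, x^k + \alpha_k \left( T(x^k) + e^k \right),
\qquad \alpha_k \in (0, 1] .
```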