Robustness to unbounded smoothness of generalized SignSGD

M Crawshaw, M Liu, F Orabona… - Advances in neural …, 2022 - proceedings.neurips.cc
Traditional analyses in non-convex optimization typically rely on the smoothness
assumption, namely requiring the gradients to be Lipschitz. However, recent evidence …
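For context, the standard smoothness assumption referred to here, and the generalized (L0, L1) condition that this line of work relaxes it to, are commonly written as follows; this is a background sketch of the usual definitions, not text from the paper:

\|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\|          % standard L-smoothness (Lipschitz gradients)
\|\nabla^2 f(x)\| \le L_0 + L_1\,\|\nabla f(x)\|        % (L_0, L_1)-smoothness: the effective smoothness constant may grow with the gradient norm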

High-probability bounds for stochastic optimization and variational inequalities: the case of unbounded variance

A Sadiev, M Danilova, E Gorbunov… - International …, 2023 - proceedings.mlr.press
During the recent years the interest of optimization and machine learning communities in
high-probability convergence of stochastic optimization methods has been growing. One of …

High probability convergence of stochastic gradient methods

Z Liu, TD Nguyen, TH Nguyen… - … on Machine Learning, 2023 - proceedings.mlr.press
In this work, we describe a generic approach to show convergence with high probability for
both stochastic convex and non-convex optimization with sub-Gaussian noise. In previous …
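For reference, one standard way to formalize the sub-Gaussian (light-tailed) noise assumption mentioned in this snippet is the following bound on the stochastic gradient error; the exact formulation used in the paper may differ:

\mathbb{E}_{\xi}\!\left[\exp\!\left(\frac{\|\nabla f(x,\xi) - \nabla f(x)\|^2}{\sigma^2}\right)\right] \le \exp(1)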

Improved convergence in high probability of clipped gradient methods with heavy tailed noise

TD Nguyen, TH Nguyen, A Ene… - Advances in Neural …, 2023 - proceedings.neurips.cc
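The clipping operator underlying this family of methods is typically the following rescaling of the stochastic gradient, plugged into an SGD-type update x_{k+1} = x_k - \gamma\,\mathrm{clip}(g_k, \lambda_k); a generic sketch, not the paper's specific algorithm:

\mathrm{clip}(g, \lambda) = \min\!\left(1, \frac{\lambda}{\|g\|}\right) g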

Momentum provably improves error feedback!

I Fatkhullin, A Tyurin, P Richtárik - Advances in Neural …, 2024 - proceedings.neurips.cc
Due to the high communication overhead when training machine learning models in a
distributed environment, modern algorithms invariably rely on lossy communication …
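As background on the error-feedback mechanism named in the title: the classic scheme keeps a memory e_t of past compression errors and feeds it back into the next compressed update. A generic single-worker sketch with compression operator C is shown below; the paper's own algorithm and its momentum variant differ in the details:

e_0 = 0, \qquad p_t = e_t + \gamma g_t, \qquad x_{t+1} = x_t - C(p_t), \qquad e_{t+1} = p_t - C(p_t)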

Clipped stochastic methods for variational inequalities with heavy-tailed noise

E Gorbunov, M Danilova, D Dobre… - Advances in …, 2022 - proceedings.neurips.cc
Stochastic first-order methods such as Stochastic Extragradient (SEG) or Stochastic Gradient
Descent-Ascent (SGDA) for solving smooth minimax problems and, more generally …
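For background, writing F for the stochastic operator of the minimax or variational-inequality problem, the plain Stochastic Extragradient step usually has the form below, with the same or a fresh sample at the extrapolated point depending on the variant; the clipped methods studied here additionally clip the stochastic estimates (sketch only, not the paper's exact method):

\tilde{x}_k = x_k - \gamma_1 F(x_k, \xi_k), \qquad x_{k+1} = x_k - \gamma_2 F(\tilde{x}_k, \xi_k')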

Methods for Convex (L0, L1)-Smooth Optimization: Clipping, Acceleration, and Adaptivity

E Gorbunov, N Tupitsa, S Choudhury, A Aliev… - arXiv preprint arXiv …, 2024 - arxiv.org
Due to the non-smoothness of optimization problems in Machine Learning, generalized
smoothness assumptions have been gaining a lot of attention in recent years. One of the …

Federated learning with client subsampling, data heterogeneity, and unbounded smoothness: A new algorithm and lower bounds

M Crawshaw, Y Bao, M Liu - Advances in Neural …, 2024 - proceedings.neurips.cc
We study the problem of Federated Learning (FL) under client subsampling and data
heterogeneity with an objective function that has potentially unbounded smoothness. This …

SGD with AdaGrad stepsizes: Full adaptivity with high probability to unknown parameters, unbounded gradients and affine variance

A Attia, T Koren - International Conference on Machine …, 2023 - proceedings.mlr.press
We study Stochastic Gradient Descent with AdaGrad stepsizes: a popular adaptive
(self-tuning) method for first-order stochastic optimization. Despite being well studied …
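To make "AdaGrad stepsizes" concrete, a minimal sketch of SGD with the scalar (AdaGrad-Norm) stepsize is given below; the function name, arguments, and defaults are illustrative and not taken from the paper:

import numpy as np

def sgd_adagrad_norm(grad_fn, x0, eta=1.0, b0=1e-8, steps=1000):
    # SGD with a scalar AdaGrad ("AdaGrad-Norm") stepsize: the learning rate
    # at step t is eta / sqrt(b0 + sum of ||g_s||^2 up to t), so it self-tunes
    # to the observed gradient magnitudes without knowing problem parameters.
    x = np.asarray(x0, dtype=float)
    acc = b0                           # accumulator for squared gradient norms
    for _ in range(steps):
        g = np.asarray(grad_fn(x))     # stochastic gradient at the current point
        acc += float(g @ g)            # add ||g_t||^2
        x = x - (eta / np.sqrt(acc)) * g
    return x

For example, sgd_adagrad_norm(lambda x: 2 * x + np.random.randn(*x.shape), np.ones(5)) runs the sketch on a noisy quadratic.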

Parameter-free regret in high probability with heavy tails

J Zhang, A Cutkosky - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We present new algorithms for online convex optimization over unbounded domains that
obtain parameter-free regret in high-probability given access only to potentially heavy-tailed …