Generalized Polyak step size for first order optimization with momentum

X Wang, M Johansson, T Zhang - … Conference on Machine …, 2023 - proceedings.mlr.press
In machine learning applications, it is well known that carefully designed learning rate (step
size) schedules can significantly improve the convergence of commonly used first-order …
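
For context, the classical Polyak step size that this line of work generalizes sets the step from the current suboptimality gap. A minimal subgradient-descent sketch in Python, assuming the optimal value f_star is known (the function names and the quadratic test problem are illustrative, not taken from the paper):

    import numpy as np

    def polyak_subgradient(f, grad, x0, f_star, n_iters=100):
        # Classical Polyak rule: step_k = (f(x_k) - f*) / ||g_k||^2,
        # which requires knowing (or estimating) the optimal value f*.
        x = np.asarray(x0, dtype=float)
        for _ in range(n_iters):
            g = grad(x)
            gap = f(x) - f_star
            if gap <= 0 or not np.any(g):
                break  # optimal up to numerical error, or zero subgradient
            x = x - (gap / np.dot(g, g)) * g
        return x

    # Illustrative use on f(x) = ||x||^2, whose minimum value is 0.
    x_hat = polyak_subgradient(lambda x: x @ x, lambda x: 2 * x,
                               x0=np.ones(5), f_star=0.0)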

Sharpness, restart and acceleration

V Roulet, A d'Aspremont - Advances in Neural Information …, 2017 - proceedings.neurips.cc
The Łojasiewicz inequality shows that Hölderian error bounds on the minimum of
convex optimization problems hold almost generically. Here, we clarify results of …
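
As a reference point, the Hölderian error bound (sharpness) condition in question is usually written as

    f(x) - f^\star \;\ge\; \mu \, d(x, X^\star)^{r} \quad \text{for all } x \in K,

where f^\star is the minimum of f on a compact set K, X^\star is the set of minimizers, d(\cdot, X^\star) is the distance to that set, and \mu > 0, r \ge 1 are generic constants (not the paper's exact notation); restart schemes exploit such bounds to accelerate first-order methods.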

Faster first-order primal-dual methods for linear programming using restarts and sharpness

D Applegate, O Hinder, H Lu, M Lubin - Mathematical Programming, 2023 - Springer
First-order primal-dual methods are appealing for their low memory overhead, fast iterations,
and effective parallelization. However, they are often slow at finding high accuracy solutions …
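
For orientation, the base iteration in this line of work is the primal-dual hybrid gradient (PDHG) method applied to the LP saddle point. In the standard Chambolle–Pock template (not necessarily the paper's exact notation), for min_x { c^\top x : Ax = b, x \ge 0 } the updates read

    x^{k+1} = \operatorname{proj}_{x \ge 0}\big( x^k - \tau (c - A^\top y^k) \big), \qquad
    y^{k+1} = y^k + \sigma \big( b - A(2x^{k+1} - x^k) \big),

with step sizes \tau, \sigma > 0 chosen so that \tau \sigma \|A\|^2 \le 1; the paper's restart scheme is applied on top of iterations of this kind.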

RSG: Beating subgradient method without smoothness and strong convexity

T Yang, Q Lin - Journal of Machine Learning Research, 2018 - jmlr.org
In this paper, we study the efficiency of a Restarted SubGradient (RSG) method that
periodically restarts the standard subgradient method (SG). We show that, when applied to a …
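
The restart template is simple to state; a hedged Python sketch of a restarted subgradient loop (the stage lengths, averaging, and step-size decay below are generic placeholders rather than the paper's tuned parameters):

    import numpy as np

    def restarted_subgradient(grad, x0, stage_iters=100, n_stages=5,
                              eta0=1.0, decay=0.5):
        # Run the standard subgradient method in stages; each stage restarts
        # from the previous stage's averaged iterate with a smaller step size.
        x_start = np.asarray(x0, dtype=float)
        for stage in range(n_stages):
            eta = eta0 * decay**stage      # shrink the step size between restarts
            x = x_start.copy()
            x_sum = np.zeros_like(x)
            for _ in range(stage_iters):
                x = x - eta * grad(x)
                x_sum += x
            x_start = x_sum / stage_iters  # restart point: stage average
        return x_start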

Perseus: A simple and optimal high-order method for variational inequalities

T Lin, MI Jordan - Mathematical Programming, 2024 - Springer
This paper settles an open and challenging question pertaining to the design of simple and
optimal high-order methods for solving smooth and monotone variational inequalities (VIs) …
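
For readers unfamiliar with the setting, the (monotone) variational inequality problem over a convex set \mathcal{X} is the textbook one:

    \text{find } x^\star \in \mathcal{X} \ \text{such that}\ \langle F(x^\star),\, x - x^\star \rangle \ge 0 \quad \text{for all } x \in \mathcal{X},

where F is smooth and monotone, i.e. \langle F(x) - F(y),\, x - y \rangle \ge 0 for all x, y \in \mathcal{X}.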

Scheduled restart momentum for accelerated stochastic gradient descent

B Wang, T Nguyen, T Sun, AL Bertozzi… - SIAM Journal on Imaging …, 2022 - SIAM
Stochastic gradient descent (SGD) algorithms, with constant momentum and its variants
such as Adam, are the optimization methods of choice for training deep neural networks …
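
A hedged Python sketch of the general idea, resetting the momentum buffer of heavy-ball SGD on a fixed schedule (the schedule, restart rule, and hyperparameters are illustrative placeholders rather than the paper's specific scheme):

    import numpy as np

    def sgd_scheduled_restart_momentum(grad, x0, lr=0.1, beta=0.9,
                                       restart_every=50, n_iters=500):
        # Heavy-ball SGD whose momentum buffer is zeroed on a fixed schedule.
        x = np.asarray(x0, dtype=float)
        v = np.zeros_like(x)
        for t in range(1, n_iters + 1):
            v = beta * v + grad(x)      # accumulate (stochastic) gradient momentum
            x = x - lr * v
            if t % restart_every == 0:
                v = np.zeros_like(x)    # scheduled restart: discard momentum
        return x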

Stochastic algorithms with geometric step decay converge linearly on sharp functions

D Davis, D Drusvyatskiy, V Charisopoulos - Mathematical Programming, 2024 - Springer
Stochastic (sub)gradient methods require step size schedule tuning to perform well in
practice. Classical tuning strategies decay the step size polynomially and lead to optimal …
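
Geometric (staircase) step decay keeps the step size constant within a stage and shrinks it by a fixed factor between stages; in generic notation (not necessarily the paper's),

    \eta_t = \eta_0 \, \rho^{\lfloor t / T \rfloor}, \qquad 0 < \rho < 1,

so the step size is cut by the factor \rho every T iterations, and the paper shows that schedules of this form yield linear convergence for stochastic (sub)gradient methods on sharp functions.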

A practical and optimal first-order method for large-scale convex quadratic programming

H Lu, J Yang - arXiv preprint arXiv:2311.07710, 2023 - arxiv.org
Convex quadratic programming (QP) is an important class of optimization problems with wide
applications in practice. The classic QP solvers are based on either simplex or barrier …
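
For reference, the problem class is typically written in a generic form such as

    \min_{x \in \mathbb{R}^n} \ \tfrac{1}{2} x^\top Q x + c^\top x \quad \text{subject to}\ Ax \le b,

with Q symmetric positive semidefinite (which is what makes the QP convex); constraint formats vary across solvers.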

Faster subgradient methods for functions with Hölderian growth

PR Johnstone, P Moulin - Mathematical Programming, 2020 - Springer
The purpose of this manuscript is to derive new convergence results for several subgradient
methods applied to minimizing nonsmooth convex functions with Hölderian growth. The …

“Efficient” subgradient methods for general convex optimization

J Renegar - SIAM Journal on Optimization, 2016 - SIAM
A subgradient method is presented for solving general convex optimization problems, the
main requirement being that a strictly feasible point is known. A feasible sequence of iterates …