Generalized Polyak step size for first order optimization with momentum
In machine learning applications, it is well known that carefully designed learning rate (step
size) schedules can significantly improve the convergence of commonly used first-order …
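For reference, the classical Polyak step size, which the paper above generalizes to the momentum setting, picks the step from the current optimality gap. A minimal sketch, assuming the optimal value f_star is known and using plain gradient descent rather than the paper's momentum variant:

```python
import numpy as np

def polyak_gradient_descent(f, grad, x0, f_star, n_iters=100):
    """Gradient descent with the classical Polyak step size
    eta_t = (f(x_t) - f_star) / ||grad f(x_t)||^2.
    This is only the non-momentum baseline, not the paper's
    generalized rule for methods with momentum."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad(x)
        gap = f(x) - f_star
        step = gap / (np.dot(g, g) + 1e-12)  # guard against a zero gradient
        x = x - step * g
    return x

# Example: minimize f(x) = ||x||^2 / 2, whose optimal value is 0.
x_opt = polyak_gradient_descent(
    f=lambda x: 0.5 * np.dot(x, x),
    grad=lambda x: x,
    x0=np.ones(5),
    f_star=0.0,
)
```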
Sharpness, restart and acceleration
The Łojasiewicz inequality shows that Hölderian error bounds on the minimum of
convex optimization problems hold almost generically. Here, we clarify results of \citet …
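For context, one common form of the Hölderian error bound (sharpness) condition referred to in the snippet is the following, where X* is the solution set, f* the optimal value, K a sublevel set, and the exponent r ≥ 1 and constant μ > 0 are problem-dependent:

```latex
% Hölderian error bound / sharpness of order r on a sublevel set K:
% the objective gap dominates a power of the distance to the solution set.
f(x) - f^{\star} \;\ge\; \mu \, \operatorname{dist}(x, X^{\star})^{\, r}
\qquad \text{for all } x \in K .
```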
Faster first-order primal-dual methods for linear programming using restarts and sharpness
First-order primal-dual methods are appealing for their low memory overhead, fast iterations,
and effective parallelization. However, they are often slow at finding high accuracy solutions …
RSG: Beating subgradient method without smoothness and strong convexity
In this paper, we study the efficiency of a Restarted SubGradient (RSG) method that
periodically restarts the standard subgradient method (SG). We show that, when applied to a …
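The restart pattern described in the snippet can be sketched as follows; the stage length, averaging choice, and step-size update here are illustrative assumptions, not the exact schedule analyzed in the paper:

```python
import numpy as np

def restarted_subgradient(subgrad, x0, n_stages=10, stage_iters=200, step0=1.0):
    """Illustrative restarted subgradient scheme: run the plain
    subgradient method with a constant step for a fixed number of
    iterations, restart from the averaged iterate, and shrink the
    step between stages. The concrete schedule in the RSG paper
    may differ from these choices."""
    x = np.asarray(x0, dtype=float)
    step = step0
    for _ in range(n_stages):
        avg = np.zeros_like(x)
        for _ in range(stage_iters):
            g = subgrad(x)
            x = x - step * g / (np.linalg.norm(g) + 1e-12)  # normalized subgradient step
            avg += x
        x = avg / stage_iters   # restart from the stage average
        step *= 0.5             # geometrically decrease the step between stages
    return x

# Example: minimize the nonsmooth function f(x) = ||x||_1.
x_opt = restarted_subgradient(subgrad=lambda x: np.sign(x), x0=3.0 * np.ones(5))
```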
Perseus: A simple and optimal high-order method for variational inequalities
This paper settles an open and challenging question pertaining to the design of simple and
optimal high-order methods for solving smooth and monotone variational inequalities (VIs) …
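For readers unfamiliar with the terminology, a monotone variational inequality of the kind referred to above can be stated as follows (standard textbook definition, not quoted from the paper):

```latex
% Monotone variational inequality VI(F, X): find x^{\star} \in X such that
\langle F(x^{\star}),\, x - x^{\star} \rangle \;\ge\; 0
\qquad \text{for all } x \in X,
% where the operator F is monotone, i.e.
\langle F(x) - F(y),\, x - y \rangle \;\ge\; 0 \quad \text{for all } x, y \in X .
```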
Scheduled restart momentum for accelerated stochastic gradient descent
Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the
optimization methods of choice for training deep neural networks …
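As a rough illustration of what "scheduled restart momentum" means in this context, the sketch below resets the momentum buffer of heavy-ball SGD at fixed intervals; the restart interval and the simple heavy-ball update are placeholders, not the paper's exact accelerated scheme:

```python
import numpy as np

def sgd_scheduled_restart_momentum(grad, x0, lr=0.1, beta=0.9,
                                   n_iters=1000, restart_every=100):
    """Heavy-ball SGD whose momentum buffer is zeroed on a fixed
    schedule. This is only a caricature of scheduled restart
    momentum; the paper's restart criterion and update differ."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for t in range(n_iters):
        if t % restart_every == 0:
            v = np.zeros_like(x)      # restart: drop accumulated momentum
        v = beta * v + grad(x)
        x = x - lr * v
    return x

# Example: quadratic objective f(x) = ||x||^2 / 2 with gradient x.
x_opt = sgd_scheduled_restart_momentum(grad=lambda x: x, x0=np.ones(5))
```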
Stochastic algorithms with geometric step decay converge linearly on sharp functions
Stochastic (sub)gradient methods require step size schedule tuning to perform well in
practice. Classical tuning strategies decay the step size polynomially and lead to optimal …
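The contrast the snippet draws between polynomial and geometric step-size decay can be made concrete with two schedules like the following (illustrative constants, not the paper's):

```python
def polynomial_decay(step0, t, power=0.5):
    """Classical schedule: the step decays polynomially in the iteration count."""
    return step0 / (1.0 + t) ** power

def geometric_decay(step0, t, epoch_len=100, factor=0.5):
    """Geometric schedule: the step is cut by a constant factor every epoch,
    the regime under which linear convergence is shown on sharp functions."""
    return step0 * factor ** (t // epoch_len)

# Example: compare the two schedules over the first 500 iterations.
steps_poly = [polynomial_decay(1.0, t) for t in range(500)]
steps_geom = [geometric_decay(1.0, t) for t in range(500)]
```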
A practical and optimal first-order method for large-scale convex quadratic programming
H Lu, J Yang - arXiv preprint arXiv:2311.07710, 2023 - arxiv.org
Convex quadratic programming (QP) is an important class of optimization problems with wide
applications in practice. The classic QP solvers are based on either simplex or barrier …
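The problem class referred to above is standard convex quadratic programming, which can be written in the following generic form (one common formulation; the paper's standard form may differ):

```latex
% Convex quadratic program: Q is symmetric positive semidefinite.
\min_{x \in \mathbb{R}^n} \;\; \tfrac{1}{2} x^{\top} Q x + c^{\top} x
\qquad \text{subject to} \quad A x \le b, \;\; G x = h .
```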
Faster subgradient methods for functions with Hölderian growth
The purpose of this manuscript is to derive new convergence results for several subgradient
methods applied to minimizing nonsmooth convex functions with Hölderian growth. The …
“Efficient” subgradient methods for general convex optimization
J Renegar - SIAM Journal on Optimization, 2016 - SIAM
A subgradient method is presented for solving general convex optimization problems, the
main requirement being that a strictly feasible point is known. A feasible sequence of iterates …