Acceleration by stepsize hedging: Multi-step descent and the silver stepsize schedule
Can we accelerate the convergence of gradient descent without changing the algorithm—
just by judiciously choosing stepsizes? Surprisingly, we show that the answer is yes. Our …
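All of the entries in this list study plain gradient descent x_{k+1} = x_k - (h_k/L) grad f(x_k) driven by a prescribed schedule of normalized stepsizes h_k (h_k = 1 recovers the classical 1/L step). As a minimal sketch of that setup, the Python snippet below runs gradient descent under an arbitrary schedule; the silver_schedule helper is a hypothetical reconstruction, assuming h_t = 1 + rho^(nu2(t) - 1) with rho = 1 + sqrt(2) the silver ratio and nu2 the 2-adic valuation, and should be checked against the papers before being relied on. The papers report an accelerated O(1/T^{log2 rho}) ~ O(1/T^{1.27}) rate for this schedule.

```python
import numpy as np

# Gradient descent with a prescribed (normalized) stepsize schedule:
#     x_{k+1} = x_k - (h_k / L) * grad_f(x_k),
# where h_k = 1 recovers the classical 1/L stepsize.
def gd_with_schedule(grad_f, x0, L, schedule):
    x = np.asarray(x0, dtype=float)
    for h in schedule:
        x = x - (h / L) * grad_f(x)
    return x

# Hypothetical reconstruction of the Silver Stepsize Schedule (assumption:
# h_t = 1 + rho**(nu2(t) - 1), rho = 1 + sqrt(2), nu2 = 2-adic valuation);
# verify against the Altschuler-Parrilo papers before relying on it.
def silver_schedule(T):
    rho = 1.0 + np.sqrt(2.0)
    def nu2(t):  # largest k such that 2**k divides t
        k = 0
        while t % 2 == 0:
            t //= 2
            k += 1
        return k
    return [1.0 + rho ** (nu2(t) - 1) for t in range(1, T + 1)]

# Toy usage: a convex quadratic f(x) = 0.5 * x^T A x with L = lambda_max(A).
if __name__ == "__main__":
    A = np.diag([1.0, 10.0, 100.0])
    L = 100.0
    grad_f = lambda x: A @ x
    x_T = gd_with_schedule(grad_f, np.ones(3), L, silver_schedule(63))
    print("f(x_T) =", 0.5 * x_T @ A @ x_T)
```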
Provably faster gradient descent via long steps
B Grimmer - SIAM Journal on Optimization, 2024 - SIAM
This work establishes new convergence guarantees for gradient descent in smooth convex
optimization via a computer-assisted analysis technique. Our theory allows nonconstant …
Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization
We provide a concise, self-contained proof that the Silver Stepsize Schedule proposed in
our companion paper directly applies to smooth (non-strongly) convex optimization …
Accelerated objective gap and gradient norm convergence for gradient descent via long steps
This work considers gradient descent for L-smooth convex optimization with stepsizes larger
than those of the classic regime in which descent can be ensured. The stepsize schedules considered …
Accelerated gradient descent via long steps
Recently, Grimmer [1] showed that for smooth convex optimization, by periodically utilizing longer steps,
gradient descent's state-of-the-art O(1/T) convergence guarantees can be …
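Concretely, the "longer steps periodically" recipe amounts to cycling through a short pattern of normalized stepsizes in which one step per cycle exceeds the classical range h in (0, 2). The sketch below uses hypothetical placeholder values for the pattern; the certified patterns in the papers above come from computer-assisted verification and are not reproduced here.

```python
import numpy as np

# Gradient descent that cycles through a short pattern of normalized stepsizes,
# taking one "long" step (h > 2) per cycle. The pattern below is a hypothetical
# placeholder, NOT one of the certified patterns from the papers above.
PATTERN = [1.0, 1.0, 4.0]

def gd_cyclic_long_steps(grad_f, x0, L, num_steps, pattern=PATTERN):
    x = np.asarray(x0, dtype=float)
    for k in range(num_steps):
        h = pattern[k % len(pattern)]   # normalized step; actual step is h / L
        x = x - (h / L) * grad_f(x)
    return x
```

Per-step monotone descent is deliberately given up here: the objective may rise on the long step, with progress measured over each full cycle instead.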
Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult
Large learning rates, when applied to gradient descent for nonconvex optimization, yield
various implicit biases including the edge of stability (Cohen et al., 2021), balancing (Wang …
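For reference, the stability threshold that "edge of stability" is measured against is the classical one: on a quadratic with curvature a, gradient descent with stepsize eta contracts the iterates exactly when eta < 2/a. The toy demo below is my own illustration of that threshold, not an experiment from the paper, which concerns nonconvex training where the curvature itself adapts.

```python
# Gradient descent on the quadratic f(x) = 0.5 * a * x**2, whose gradient is a*x.
# Each step multiplies x by (1 - eta * a), so |1 - eta * a| < 1, i.e. eta < 2/a,
# is the classical stability condition; eta near 2/a gives slowly decaying
# oscillations, the regime loosely associated with "edge of stability".
def run_gd(a, eta, x0=1.0, steps=50):
    x = x0
    for _ in range(steps):
        x = x - eta * a * x
    return x

a = 4.0  # curvature, so the stability threshold is 2/a = 0.5
for eta in [0.1, 0.45, 0.49, 0.55]:
    print(f"eta={eta:.2f}  final |x| = {abs(run_gd(a, eta)):.3e}")
```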
Accelerated gradient descent by concatenation of stepsize schedules
Z Zhang, R Jiang - arXiv preprint arXiv:2410.12395, 2024 - arxiv.org
This work considers stepsize schedules for gradient descent on smooth convex objectives.
We extend the existing literature and propose a unified technique for constructing stepsizes …
Relaxed proximal point algorithm: Tight complexity bounds and acceleration without momentum
In this paper, we focus on the relaxed proximal point algorithm (RPPA) for solving convex
(possibly nonsmooth) optimization problems. We conduct a comprehensive study on three …
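For concreteness, the relaxed proximal point iteration is x_{k+1} = x_k + lam * (prox_{c f}(x_k) - x_k) with relaxation parameter lam in (0, 2), where lam = 1 recovers the classical proximal point method. The sketch below is a minimal instantiation on a convex quadratic, where the proximal map is an explicit linear solve; the matrix A, vector b, and parameter choices are illustrative and not from the paper.

```python
import numpy as np

# Relaxed proximal point algorithm (RPPA):
#     x_{k+1} = x_k + lam * (prox_{c f}(x_k) - x_k),   relaxation lam in (0, 2).
# Here f(x) = 0.5 x^T A x - b^T x, for which the proximal map is linear:
#     prox_{c f}(v) = (I + c A)^{-1} (v + c b).
def rppa_quadratic(A, b, x0, c=1.0, lam=1.5, iters=100):
    n = len(b)
    M = np.linalg.inv(np.eye(n) + c * A)   # resolvent of c * grad f
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        prox_x = M @ (x + c * b)
        x = x + lam * (prox_x - x)
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = np.linalg.solve(A, b)             # exact minimizer of f
x_hat = rppa_quadratic(A, b, x0=np.zeros(2))
print("error:", np.linalg.norm(x_hat - x_star))
```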
Anytime Acceleration of Gradient Descent
This work investigates stepsize-based acceleration of gradient descent with anytime
convergence guarantees. For smooth (non-strongly) convex optimization, we propose a …
Gradient descent with adaptive stepsize converges (nearly) linearly under fourth-order growth
A prevalent belief among optimization specialists is that linear convergence of gradient
descent is contingent on the function growing quadratically away from its minimizers. In this …
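A one-dimensional example shows why adaptivity can substitute for quadratic growth: on f(x) = x^4, which grows only quartically at its minimizer, constant-stepsize gradient descent converges sublinearly, while the classical Polyak stepsize eta_k = (f(x_k) - f*) / |grad f(x_k)|^2 contracts the iterate by a factor of 3/4 every step. This uses a textbook adaptive rule purely for illustration; it is not claimed to be the stepsize analyzed in the paper.

```python
f = lambda x: x ** 4          # quartic growth: no quadratic growth at x* = 0
grad = lambda x: 4 * x ** 3
f_star = 0.0

# Polyak's adaptive stepsize: eta_k = (f(x_k) - f*) / |grad f(x_k)|**2.
# On f(x) = x**4 this gives x_{k+1} = (3/4) * x_k, i.e. linear convergence,
# whereas a constant stepsize only converges sublinearly.
x_polyak, x_const = 1.0, 1.0
eta_const = 0.1
for k in range(100):
    g = grad(x_polyak)
    if g != 0.0:
        x_polyak -= (f(x_polyak) - f_star) / g ** 2 * g
    x_const -= eta_const * grad(x_const)

print("Polyak stepsize: |x - x*| =", abs(x_polyak))   # ~ (3/4)**100
print("constant step:   |x - x*| =", abs(x_const))    # decays only polynomially
```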