Katyusha: The first direct acceleration of stochastic gradient methods
Z Allen-Zhu - Journal of Machine Learning Research, 2018 - jmlr.org
Nesterov's momentum trick is famously known for accelerating gradient descent, and has
been proven useful in building fast iterative algorithms. However, in the stochastic setting …
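The snippet above refers to Nesterov's momentum trick in the stochastic setting. As a point of reference, here is a minimal Python sketch of plain SGD with Nesterov-style (look-ahead) momentum; it is not the Katyusha algorithm itself, which additionally builds in variance reduction, and the step size, momentum value, and toy objective below are illustrative assumptions.

import numpy as np

def sgd_nesterov(grad, x0, lr=0.01, momentum=0.9, n_steps=1000):
    # Plain SGD with Nesterov-style momentum: evaluate the (possibly
    # stochastic) gradient at a look-ahead point, then update the velocity
    # and take the momentum step.
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        g = grad(x + momentum * v)   # gradient at the look-ahead point
        v = momentum * v - lr * g    # velocity update
        x = x + v                    # momentum step
    return x

# Toy usage: minimize f(x) = 0.5 ||x||^2 with a noisy gradient oracle.
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + 0.01 * rng.standard_normal(x.shape)
x_hat = sgd_nesterov(noisy_grad, x0=np.ones(5))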
Convex optimization: Algorithms and complexity
S Bubeck - Foundations and Trends® in Machine Learning, 2015 - nowpublishers.com
This monograph presents the main complexity theorems in convex optimization and their
corresponding algorithms. Starting from the fundamental theory of black-box optimization …
A variational perspective on accelerated methods in optimization
Accelerated gradient methods play a central role in optimization, achieving optimal rates in
many settings. Although many generalizations and extensions of Nesterov's original …
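The variational perspective above derives accelerated methods as discretizations of continuous-time curves generated by a Bregman Lagrangian. Stated here from memory, so the exact scaling conventions should be checked against the paper, the Lagrangian has the form

\mathcal{L}(X, \dot{X}, t) = e^{\alpha_t + \gamma_t} \Big( D_h\big(X + e^{-\alpha_t}\dot{X},\, X\big) - e^{\beta_t} f(X) \Big),
\qquad D_h(y, x) = h(y) - h(x) - \langle \nabla h(x),\, y - x \rangle,

where h is the distance-generating function and the time-dependent weights alpha_t, beta_t, gamma_t are required to satisfy "ideal scaling" conditions for the resulting curves to achieve the accelerated rates.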
Understanding the acceleration phenomenon via high-resolution differential equations
Gradient-based optimization algorithms can be studied from the perspective of limiting
ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not …
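For reference, the low-resolution limiting ODE of Nesterov's accelerated gradient method for convex f, which is the object this line of work refines, is

\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0, \qquad X(0) = x_0, \quad \dot{X}(0) = 0.

The high-resolution equations studied in the paper keep O(\sqrt{s}) terms of the step size s, which is what lets them distinguish, for example, Nesterov acceleration from heavy-ball at the ODE level; the exact correction terms are not reproduced here.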
Acceleration methods
This monograph covers some recent advances in a range of acceleration techniques
frequently used in convex optimization. We first use quadratic optimization problems to …
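Since the monograph above uses quadratic problems as the warm-up case, here is a hedged Python sketch of the Polyak heavy-ball iteration on a quadratic with the classical tuning in terms of the extreme eigenvalues mu and L; the parameter formulas are the standard textbook choices for this setting, not something taken from the monograph.

import numpy as np

def heavy_ball_quadratic(A, b, n_steps=200):
    # Heavy-ball on f(x) = 0.5 x^T A x - b^T x with the classical tuning
    # for eigenvalues in [mu, L]:
    #   step size alpha = 4 / (sqrt(L) + sqrt(mu))^2
    #   momentum  beta  = ((sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu)))^2
    eigs = np.linalg.eigvalsh(A)
    mu, L = eigs[0], eigs[-1]
    alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
    beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
    x = np.zeros_like(b)
    x_prev = x.copy()
    for _ in range(n_steps):
        grad = A @ x - b
        x, x_prev = x - alpha * grad + beta * (x - x_prev), x
    return x

# Toy usage on a random symmetric positive definite system.
rng = np.random.default_rng(1)
M = rng.standard_normal((10, 10))
A = M @ M.T + np.eye(10)
b = rng.standard_normal(10)
x_hat = heavy_ball_quadratic(A, b)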
Acceleration by stepsize hedging: Multi-step descent and the silver stepsize schedule
Can we accelerate the convergence of gradient descent without changing the algorithm—
just by judiciously choosing stepsizes? Surprisingly, we show that the answer is yes. Our …
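The paper above accelerates plain gradient descent purely through the choice of stepsizes. The Python sketch below shows the only moving part in that setting, gradient descent driven by an externally supplied schedule; the actual silver stepsizes, built from the silver ratio 1 + sqrt(2), are not reproduced here, and the placeholder schedule in the usage lines is a made-up example.

import numpy as np

def gd_with_schedule(grad, x0, stepsizes):
    # Vanilla gradient descent; the stepsize schedule is the only design choice.
    x = np.asarray(x0, dtype=float)
    for eta in stepsizes:
        x = x - eta * grad(x)
    return x

# Usage on f(x) = 0.5 ||x||^2 (gradient is x) with a hypothetical non-constant schedule.
schedule = [1.5, 0.5, 1.5, 0.5]
x_out = gd_with_schedule(lambda x: x, np.ones(3), schedule)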
Accelerated methods for nonconvex optimization
We present an accelerated gradient method for nonconvex optimization problems with
Lipschitz continuous first and second derivatives. In a time O(ε^{-7/4} log(1/ε)), the method …
Accelerated gradient descent escapes saddle points faster than gradient descent
Nesterov's accelerated gradient descent (AGD), an instance of the general family of
“momentum methods,” provably achieves a faster convergence rate than gradient descent …
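For concreteness, one standard form of Nesterov's AGD for a smooth convex objective is sketched below in Python, with the common momentum weight (k - 1)/(k + 2); the saddle-point escape result above concerns a perturbed variant of such a scheme, which this sketch does not include.

import numpy as np

def nesterov_agd(grad, x0, lr, n_steps=500):
    # Nesterov's accelerated gradient descent in its standard convex form:
    # extrapolate with momentum weight (k - 1) / (k + 2), then take a gradient
    # step at the extrapolated point. lr should be at most 1/L for L-smooth f.
    x = np.asarray(x0, dtype=float)
    x_prev = x.copy()
    for k in range(1, n_steps + 1):
        y = x + (k - 1) / (k + 2) * (x - x_prev)
        x_prev = x
        x = y - lr * grad(y)
    return x

# Usage on f(x) = 0.5 ||x||^2, which is 1-smooth, so lr = 1.0 is admissible.
x_min = nesterov_agd(lambda x: x, np.ones(4), lr=1.0)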
A faster cutting plane method and its implications for combinatorial and convex optimization
In this paper we improve upon the running time for finding a point in a convex set given a
separation oracle. In particular, given a separation oracle for a convex set K ⊂ R^n that is …
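As background for the entry above, the classical way to find a point in a convex set from a separation oracle is the ellipsoid method; the Python sketch below implements that textbook scheme, not the faster cutting-plane method of the paper, and the oracle and radii in the usage lines are illustrative assumptions.

import numpy as np

def ellipsoid_method(separation_oracle, c0, R, n_steps=1000):
    # Central-cut ellipsoid method. separation_oracle(x) returns None if x lies
    # in the target convex set K, and otherwise a vector a defining a halfspace
    # {y : a^T y <= a^T x} that contains K. The ellipsoid is stored as
    # {x : (x - c)^T P^{-1} (x - c) <= 1}, initialized to the ball B(c0, R).
    n = len(c0)
    c = np.asarray(c0, dtype=float)
    P = (R ** 2) * np.eye(n)
    for _ in range(n_steps):
        a = separation_oracle(c)
        if a is None:
            return c                              # feasible point found
        g = (P @ a) / np.sqrt(a @ (P @ a))        # cut direction in the P-metric
        c = c - g / (n + 1)                       # move center into the kept half
        P = (n ** 2 / (n ** 2 - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(g, g))
    return c

# Usage: find a point in the unit ball around (2, 2), starting from B(0, 10).
oracle = lambda x: None if np.linalg.norm(x - 2.0) <= 1.0 else (x - 2.0)
point = ellipsoid_method(oracle, c0=np.zeros(2), R=10.0)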
Linear coupling: An ultimate unification of gradient and mirror descent
First-order methods play a central role in large-scale machine learning. Even though many
variations exist, each suited to a particular problem, almost all such methods fundamentally …
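The snippet above motivates linearly coupling a gradient step with a mirror step. Below is a hedged Euclidean sketch of that coupling structure: one gradient query per iteration feeds both a gradient-descent iterate y and a dual-averaging style mirror iterate z, which are mixed to form the next query point. The fixed tau and alpha values are illustrative; the paper chooses them carefully (and typically lets them vary) to obtain acceleration.

import numpy as np

def linear_coupling_sketch(grad, x0, L, tau=0.5, alpha=1.0, n_steps=500):
    # Couple a gradient step and a Euclidean mirror step through one shared
    # gradient query per iteration.
    y = np.asarray(x0, dtype=float)
    z = y.copy()
    for _ in range(n_steps):
        x = tau * z + (1.0 - tau) * y    # coupling point
        g = grad(x)                      # single gradient evaluation
        y = x - g / L                    # gradient (descent) step
        z = z - alpha * g                # mirror step, Euclidean case
    return y

# Usage on f(x) = 0.5 ||x||^2 (L = 1).
y_out = linear_coupling_sketch(lambda x: x, np.ones(3), L=1.0)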