Stochastic gradient descent and its variants in machine learning
P Netrapalli - Journal of the Indian Institute of Science, 2019 - Springer
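As a baseline for the variants this survey covers, here is a minimal sketch of the plain SGD update; the toy objective, data, and step size are illustrative placeholders rather than anything from the paper.

import numpy as np

def sgd(grad_fn, x0, data, lr=0.1, epochs=5, seed=0):
    """Plain SGD: repeatedly step along the negative gradient of a single example."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            x = x - lr * grad_fn(x, data[i])   # gradient of one shuffled example
    return x

# Toy usage: least squares on synthetic (a, b) pairs, gradient of 0.5*(a*x - b)^2.
data = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1)]
grad = lambda x, ab: (ab[0] * x - ab[1]) * ab[0]
print(sgd(grad, x0=0.0, data=data))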
Convergence of Adam under relaxed assumptions
In this paper, we provide a rigorous proof of convergence of the Adaptive Moment Estimation
(Adam) algorithm for a wide class of optimization objectives. Despite the popularity and …
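For context, a compact sketch of the standard Adam update that convergence analyses like this one study; the hyperparameter values below are the usual defaults, and the surrounding toy problem is only illustrative.

import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: exponential moving averages of the gradient and its square,
    bias-corrected, then a coordinate-wise scaled update."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage on f(x) = x^2 with gradient 2x.
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 1001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
print(theta)  # approaches 0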
A survey of optimization methods from a machine learning perspective
Machine learning has developed rapidly, achieving many theoretical breakthroughs, and is
widely applied in various fields. Optimization, as an important part of machine learning, has …
Lower bounds for non-convex stochastic optimization
We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ)
using stochastic first-order methods. In a well-studied model where algorithms access …
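Spelled out, the stationarity target in the snippet is the standard one for non-convex problems, where global optimality is unattainable in general and one instead asks for a small gradient:

\[
x \text{ is an } \epsilon\text{-stationary point of } f
\quad\Longleftrightarrow\quad
\|\nabla f(x)\| \le \epsilon .
\]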
Momentum-based variance reduction in non-convex SGD
Variance reduction has emerged in recent years as a strong competitor to stochastic
gradient descent in non-convex problems, providing the first algorithms to improve upon the …
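A sketch of a momentum-based variance-reduced gradient estimator of the kind this line of work analyzes; the recursion is one common form, and the toy noise oracle, constants, and function names are assumptions for illustration, not the paper's exact algorithm.

import numpy as np

def storm_like(grad, x0, noise_scale=0.5, lr=0.05, a=0.1, steps=300, seed=0):
    """Momentum-based variance reduction:
    d_t = g(x_t; xi_t) + (1 - a) * (d_{t-1} - g(x_{t-1}; xi_t)),
    where both gradient evaluations share the same sample xi_t."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    stoch = lambda z, xi: grad(z) + xi           # toy stochastic oracle: true gradient + noise
    d = stoch(x, noise_scale * rng.standard_normal(x.shape))
    for _ in range(steps):
        x_prev, x = x, x - lr * d
        xi = noise_scale * rng.standard_normal(x.shape)
        d = stoch(x, xi) + (1 - a) * (d - stoch(x_prev, xi))   # same xi in both terms
    return x

# Toy usage: minimize f(x) = 0.5*||x||^2, whose gradient is x.
print(storm_like(lambda z: z, x0=np.ones(3)))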
SPIDER: Near-optimal non-convex optimization via stochastic path-integrated differential estimator
In this paper, we propose a new technique named Stochastic Path-Integrated
Differential EstimatoR (SPIDER), which can be used to track many deterministic quantities of …
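A sketch of a path-integrated estimator in the spirit of the title, for a finite-sum objective; the refresh schedule, single-sample inner updates, and names are illustrative assumptions rather than the paper's precise procedure.

import numpy as np

def spider_like(grad_i, n, x0, lr=0.05, refresh=10, steps=100, seed=0):
    """SPIDER-style tracking: periodically compute a full gradient, then update the
    estimate v with stochastic gradient DIFFERENCES along the optimization path."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    v = None
    for t in range(steps):
        if t % refresh == 0:
            v = np.mean([grad_i(x, i) for i in range(n)], axis=0)   # full-gradient refresh
        else:
            i = rng.integers(n)
            v = v + grad_i(x, i) - grad_i(x_prev, i)                # path-integrated difference
        x_prev, x = x, x - lr * v
    return x

# Toy usage: finite sum of shifted quadratics f_i(x) = 0.5*||x - c_i||^2.
centers = np.linspace(-1.0, 1.0, 5)[:, None] * np.ones(3)
g = lambda x, i: x - centers[i]
print(spider_like(g, n=5, x0=np.zeros(3)))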
Adaptive methods for nonconvex optimization
Adaptive gradient methods that rely on scaling gradients down by the square root of
exponential moving averages of past squared gradients, such as RMSProp, Adam, Adadelta …
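The scaling rule described in the snippet, written out as a minimal RMSProp-style step; the constants are common defaults used only for illustration.

import numpy as np

def rmsprop_step(theta, grad, v, lr=1e-2, beta=0.9, eps=1e-8):
    """Adaptive scaling: divide the gradient by the square root of an
    exponential moving average of past squared gradients."""
    v = beta * v + (1 - beta) * grad ** 2
    theta = theta - lr * grad / (np.sqrt(v) + eps)
    return theta, v

# Toy usage on f(x) = x^2 (gradient 2x).
theta, v = np.array([3.0]), np.zeros(1)
for _ in range(500):
    theta, v = rmsprop_step(theta, 2 * theta, v)
print(theta)  # approaches 0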
Stochastic variance reduction for nonconvex optimization
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient
(SVRG) methods for them. SVRG and related methods have recently surged into …
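To make the snippet concrete, a compact sketch of the SVRG gradient estimator for a finite sum; the epoch length, step size, and toy problem are illustrative assumptions.

import numpy as np

def svrg(grad_i, n, x0, lr=0.1, epochs=10, inner=50, seed=0):
    """SVRG: keep a snapshot x_tilde and its full gradient mu; inner steps use
    grad_i(x) - grad_i(x_tilde) + mu, an unbiased, variance-reduced estimate."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        x_tilde = x.copy()
        mu = np.mean([grad_i(x_tilde, i) for i in range(n)], axis=0)   # full gradient at snapshot
        for _ in range(inner):
            i = rng.integers(n)
            x = x - lr * (grad_i(x, i) - grad_i(x_tilde, i) + mu)
    return x

# Toy usage: the same shifted-quadratic finite sum as in the sketch above.
centers = np.linspace(-1.0, 1.0, 5)[:, None] * np.ones(3)
g = lambda x, i: x - centers[i]
print(svrg(g, n=5, x0=np.zeros(3)))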
Katyusha: The first direct acceleration of stochastic gradient methods
Z Allen-Zhu - Journal of Machine Learning Research, 2018 - jmlr.org
Nesterov's momentum trick is famously known for accelerating gradient descent, and has
been proven useful in building fast iterative algorithms. However, in the stochastic setting …
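The "momentum trick" referred to here is classical Nesterov acceleration; the deterministic sketch below shows that trick only, and is not Katyusha itself, whose momentum terms differ.

import numpy as np

def nesterov(grad, x0, lr=0.1, momentum=0.9, steps=100):
    """Nesterov's accelerated gradient: evaluate the gradient at a look-ahead
    point extrapolated along the previous direction of travel."""
    x = np.asarray(x0, dtype=float)
    x_prev = x.copy()
    for _ in range(steps):
        y = x + momentum * (x - x_prev)      # look-ahead / extrapolation step
        x_prev, x = x, y - lr * grad(y)      # gradient step taken from the look-ahead point
    return x

# Toy usage: minimize f(x) = 0.5*||x||^2.
print(nesterov(lambda z: z, x0=np.ones(3) * 5))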
Weakly-convex–concave min–max optimization: provable algorithms and applications in machine learning
Min–max problems have broad applications in machine learning, including learning with
non-decomposable loss and learning with robustness to data distribution. Convex–concave …
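For concreteness, the template these methods address is min_x max_y f(x, y); the sketch below runs plain simultaneous gradient descent-ascent on a toy convex-concave objective, as a baseline rather than any of the paper's provable algorithms.

import numpy as np

def gda(grad_x, grad_y, x0, y0, lr=0.05, steps=500):
    """Gradient descent-ascent: descend in the min variable x, ascend in the max variable y."""
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(steps):
        x, y = x - lr * grad_x(x, y), y + lr * grad_y(x, y)   # simultaneous updates
    return x, y

# Toy usage: f(x, y) = 0.5*x^2 + x*y - 0.5*y^2, with saddle point at (0, 0).
gx = lambda x, y: x + y
gy = lambda x, y: x - y
print(gda(gx, gy, x0=np.array([1.0]), y0=np.array([1.0])))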