Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition
In 1963, Polyak proposed a simple condition that is sufficient to show a global linear
convergence rate for gradient descent. This condition is a special case of the Łojasiewicz …
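For reference, the PL condition in its standard form (a generic statement, not quoted from this abstract): an L-smooth function f with minimum value f* satisfies PL with constant μ > 0 if

  (1/2) ‖∇f(x)‖² ≥ μ (f(x) − f*)   for all x,

and gradient descent with step size 1/L then converges linearly, f(x_k) − f* ≤ (1 − μ/L)^k (f(x_0) − f*).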
On the convergence of decentralized gradient descent
Consider the consensus problem of minimizing f(x) = ∑_{i=1}^{n} f_i(x), where x ∈ R^p and each f_i
is only known to the individual agent i in a connected network of n agents. To solve this …
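A minimal sketch of the decentralized gradient descent (DGD) iteration that such consensus methods use, in common notation (the mixing matrix W below is an assumption of the sketch, not taken from the abstract): each agent i holds a local copy x_i and repeats

  x_i^{k+1} = ∑_{j=1}^{n} w_{ij} x_j^k − α ∇f_i(x_i^k),

where W = (w_{ij}) is a doubly stochastic matrix whose nonzero entries follow the network's edges and α is a step size; averaging with neighbors drives consensus while the local gradient step drives minimization of f.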
Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
We improve a recent guarantee of Bach and Moulines on the linear convergence of SGD for
smooth and strongly convex objectives, reducing a quadratic dependence on the strong …
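For context, the randomized Kaczmarz update for a consistent linear system Ax = b (a standard form, stated here only to anchor the comparison with weighted-sampling SGD): at each step sample row i with probability proportional to ‖a_i‖², then project onto its hyperplane,

  x_{k+1} = x_k + ((b_i − ⟨a_i, x_k⟩) / ‖a_i‖²) a_i,

i.e., a stochastic-gradient step on the least-squares objective with rows sampled in proportion to their squared norms.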
Global convergence and variance reduction for a class of nonconvex-nonconcave minimax problems
Nonconvex minimax problems appear frequently in emerging machine learning
applications, such as generative adversarial networks and adversarial learning. Simple …
Understanding incremental learning of gradient descent: A fine-grained analysis of matrix sensing
It is believed that Gradient Descent (GD) induces an implicit bias towards good
generalization in training machine learning models. This paper provides a fine-grained …
Convergence rates for the stochastic gradient descent method for non-convex objective functions
We prove the convergence to minima and estimates on the rate of convergence for the
stochastic gradient descent method in the case of not necessarily locally convex nor …
The implicit regularization of stochastic gradient flow for least squares
We study the implicit regularization of mini-batch stochastic gradient descent, when applied
to the fundamental problem of least squares regression. We leverage a continuous-time …
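For orientation, the continuous-time object such analyses start from is the gradient flow for least squares (the deterministic baseline only; the stochastic gradient flow studied in the paper adds a noise term whose exact form is not given in this snippet): with f(β) = ‖y − Xβ‖² / (2n), the flow is

  dβ(t)/dt = −∇f(β(t)) = Xᵀ(y − Xβ(t)) / n,

whose solution started from β(0) = 0 moves from the zero estimate toward a least-squares solution as t grows.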
On the lower bound of minimizing Polyak-Łojasiewicz functions
The Polyak-Łojasiewicz (PL) condition (Polyak, 1963) is weaker than strong convexity but suffices to ensure global convergence for Gradient Descent …
SGD for structured nonconvex functions: Learning rates, minibatching and interpolation
Stochastic Gradient Descent (SGD) is used routinely for optimizing non-
convex functions. Yet, the standard convergence theory for SGD in the smooth non-convex …
On exponential convergence of SGD in non-convex over-parametrized learning
Large over-parametrized models learned via stochastic gradient descent (SGD) methods
have become a key element in modern machine learning. Although SGD methods are very …