Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition

H Karimi, J Nutini, M Schmidt - Joint European conference on machine …, 2016 - Springer
In 1963, Polyak proposed a simple condition that is sufficient to show a global linear
convergence rate for gradient descent. This condition is a special case of the Łojasiewicz …
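For context, the condition and the rate it yields can be sketched as follows (standard statement, with L the smoothness constant of f and f^* its minimum value; notation mine, not quoted from the paper):

    % PL condition and the resulting linear rate for gradient descent with step size 1/L
    \frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr) \quad \text{for all } x
    \qquad\Longrightarrow\qquad
    f(x_k) - f^* \;\le\; \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{k}\bigl(f(x_0) - f^*\bigr),
    \qquad\text{where } x_{k+1} = x_k - \tfrac{1}{L}\nabla f(x_k).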

On the convergence of decentralized gradient descent

K Yuan, Q Ling, W Yin - SIAM Journal on Optimization, 2016 - SIAM
Consider the consensus problem of minimizing f(x) = ∑_{i=1}^{n} f_i(x), where x ∈ R^p and each f_i
is only known to the individual agent i in a connected network of n agents. To solve this …
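A minimal sketch of the decentralized gradient descent iteration analyzed in this setting, assuming a doubly stochastic mixing matrix W supported on the network's edges and a common step size alpha (all identifiers are illustrative, not the paper's code):

    import numpy as np

    def decentralized_gd(grads, W, x0, alpha, iters):
        # grads: list of n callables, grads[i](x) is the gradient of f_i at x
        # W:     (n, n) doubly stochastic mixing matrix, W[i, j] > 0 only for
        #        neighboring agents i and j (and on the diagonal)
        # x0:    (n, p) array holding one local iterate per agent
        x = x0.copy()
        for _ in range(iters):
            mixed = W @ x                                    # average neighbors' iterates
            local = np.stack([g(xi) for g, xi in zip(grads, x)])
            x = mixed - alpha * local                        # then take a local gradient step
        return x

Roughly speaking, with a fixed alpha the local iterates only reach a neighborhood of the minimizer; the analysis quantifies the size of that neighborhood and the effect of a diminishing step size.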

Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm

D Needell, R Ward, N Srebro - Advances in neural …, 2014 - proceedings.neurips.cc
We improve a recent guarantee of Bach and Moulines on the linear convergence of SGD for
smooth and strongly convex objectives, reducing a quadratic dependence on the strong …
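For reference, the randomized Kaczmarz method for a consistent system Ax = b samples a row with probability proportional to its squared norm and projects the iterate onto that row's hyperplane; this weighted sampling is what the work relates to SGD with importance sampling. A small illustrative sketch (not the paper's code):

    import numpy as np

    def randomized_kaczmarz(A, b, iters=2000, seed=0):
        rng = np.random.default_rng(seed)
        m, n = A.shape
        row_norms2 = np.einsum("ij,ij->i", A, A)        # squared row norms
        probs = row_norms2 / row_norms2.sum()           # sampling weights
        x = np.zeros(n)
        for _ in range(iters):
            i = rng.choice(m, p=probs)
            a = A[i]
            x += (b[i] - a @ x) / row_norms2[i] * a     # project onto {x : <a, x> = b_i}
        return x

    # Tiny usage example on a consistent overdetermined system.
    A = np.random.default_rng(1).normal(size=(200, 20))
    x_hat = randomized_kaczmarz(A, A @ np.ones(20))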

Global convergence and variance reduction for a class of nonconvex-nonconcave minimax problems

J Yang, N Kiyavash, N He - Advances in Neural Information …, 2020 - proceedings.neurips.cc
Nonconvex minimax problems appear frequently in emerging machine learning
applications, such as generative adversarial networks and adversarial learning. Simple …

Understanding incremental learning of gradient descent: A fine-grained analysis of matrix sensing

J Jin, Z Li, K Lyu, SS Du, JD Lee - … Conference on Machine …, 2023 - proceedings.mlr.press
It is believed that Gradient Descent (GD) induces an implicit bias towards good
generalization in training machine learning models. This paper provides a fine-grained …

Convergence rates for the stochastic gradient descent method for non-convex objective functions

B Fehrman, B Gess, A Jentzen - Journal of Machine Learning Research, 2020 - jmlr.org
We prove the convergence to minima and estimates on the rate of convergence for the
stochastic gradient descent method in the case of not necessarily locally convex nor …

The implicit regularization of stochastic gradient flow for least squares

A Ali, E Dobriban, R Tibshirani - International conference on …, 2020 - proceedings.mlr.press
We study the implicit regularization of mini-batch stochastic gradient descent, when applied
to the fundamental problem of least squares regression. We leverage a continuous-time …
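As a point of reference for the continuous-time view, the full-batch gradient flow for the least squares objective f(β) = ‖y − Xβ‖²/(2n) is the ODE below, with the standard closed-form solution started from β(0) = 0; this is only a sketch of the deterministic baseline, while the paper itself studies a stochastic variant of this flow:

    % gradient flow for least squares and its solution from the origin
    \dot{\beta}(t) \;=\; \tfrac{1}{n}\,X^\top\bigl(y - X\beta(t)\bigr),
    \qquad
    \beta(t) \;=\; \bigl(X^\top X\bigr)^{+}\Bigl(I - e^{-t\,X^\top X/n}\Bigr)X^\top y.

Analyses in this line of work then compare the flow at time t with ridge regression at a penalty of roughly 1/t.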

On the lower bound of minimizing Polyak-Łojasiewicz functions

P Yue, C Fang, Z Lin - The Thirty Sixth Annual Conference …, 2023 - proceedings.mlr.press
The Polyak-Łojasiewicz (PL) condition (Polyak, 1963) is weaker than strong convexity but
suffices to ensure global convergence for the Gradient Descent …
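To make the contrast with strong convexity concrete, here is a tiny numerical check (illustrative only) using f(x) = x² + 3 sin²(x), a function commonly used in this literature as a non-convex example satisfying the PL condition:

    import numpy as np

    # f(x) = x^2 + 3 sin^2(x) is non-convex (f'' dips below zero) but satisfies the
    # PL condition, so gradient descent with step size 1/L still converges linearly
    # to the global minimum f* = 0.  Here f'' = 2 + 6 cos(2x) <= 8, so L = 8.
    def f(x):
        return x**2 + 3 * np.sin(x)**2

    def grad(x):
        return 2 * x + 3 * np.sin(2 * x)

    L, x = 8.0, 3.0
    gaps = []
    for _ in range(10):
        gaps.append(f(x))              # optimality gap f(x) - f*, with f* = 0
        x -= grad(x) / L

    ratios = np.array(gaps[1:]) / np.array(gaps[:-1])
    print(ratios.max())                # about 0.95: the gap shrinks by a constant factor per step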

SGD for structured nonconvex functions: Learning rates, minibatching and interpolation

R Gower, O Sebbouh, N Loizou - … Conference on Artificial …, 2021 - proceedings.mlr.press
Stochastic Gradient Descent (SGD) is being used routinely for optimizing non-
convex functions. Yet, the standard convergence theory for SGD in the smooth non-convex …

On exponential convergence of SGD in non-convex over-parametrized learning

R Bassily, M Belkin, S Ma - arXiv preprint arXiv:1811.02564, 2018 - arxiv.org
Large over-parametrized models learned via stochastic gradient descent (SGD) methods
have become a key element in modern machine learning. Although SGD methods are very …