Stochastic policy gradient methods: Improved sample complexity for Fisher-non-degenerate policies

I Fatkhullin, A Barakat, A Kireeva… - … Conference on Machine …, 2023 - proceedings.mlr.press
Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed
the development of their theoretical foundations. Despite the huge efforts directed at the …
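To make the setting concrete, below is a minimal vanilla stochastic policy gradient (REINFORCE) step for a one-step Gaussian bandit. This is a hedged illustration of the baseline PG update only, not the paper's improved method; `reward_fn`, `lr`, and `sigma` are illustrative names.

```python
import numpy as np

def reinforce_gaussian_bandit(reward_fn, theta0=0.0, lr=0.05, sigma=1.0,
                              n_steps=500, seed=0):
    """Vanilla stochastic policy gradient (REINFORCE) for a Gaussian policy
    a ~ N(theta, sigma^2) on a one-step bandit.
    Score function: d/dtheta log pi(a) = (a - theta) / sigma^2."""
    rng = np.random.default_rng(seed)
    theta = float(theta0)
    for _ in range(n_steps):
        a = rng.normal(theta, sigma)              # sample an action from the policy
        r = reward_fn(a)                          # observe a stochastic reward
        theta += lr * r * (a - theta) / sigma**2  # gradient ascent on expected reward
    return theta

# Example: rewards peak at a = 2, so theta should drift toward 2.
print(reinforce_gaussian_bandit(lambda a: -(a - 2.0) ** 2))
```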

Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods

N Loizou, P Richtárik - Computational Optimization and Applications, 2020 - Springer
In this paper we study several classes of stochastic optimization algorithms enriched with
heavy ball momentum. Among the methods studied are: stochastic gradient descent …
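For reference, the heavy-ball (Polyak) momentum recursion the paper attaches to these methods has, in its SGD form, the two-term update below; a minimal sketch, with `grad_fn`, `lr`, and `beta` as assumed illustrative names.

```python
import numpy as np

def sgd_heavy_ball(grad_fn, x0, lr=0.01, beta=0.9, n_steps=1000):
    """SGD with heavy-ball momentum:
    x_{k+1} = x_k - lr * g_k + beta * (x_k - x_{k-1})."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_steps):
        g = grad_fn(x)                             # stochastic gradient at x_k
        x_next = x - lr * g + beta * (x - x_prev)  # gradient step plus momentum term
        x_prev, x = x, x_next
    return x
```

The same update is often written with a velocity buffer, m_{k+1} = beta * m_k + g_k and x_{k+1} = x_k - lr * m_{k+1}; the two forms are algebraically equivalent.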

Momentum improves normalized SGD

A Cutkosky, H Mehta - International Conference on Machine …, 2020 - proceedings.mlr.press
We provide an improved analysis of normalized SGD showing that adding momentum
provably removes the need for large batch sizes on non-convex objectives. Then, we …
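The update the abstract describes, in sketch form: maintain a momentum average and step a fixed length along its normalized direction. A minimal sketch assuming `grad_fn`, `lr`, and `beta` as illustrative names.

```python
import numpy as np

def normalized_sgd_momentum(grad_fn, x0, lr=0.01, beta=0.9, n_steps=1000,
                            eps=1e-12):
    """Normalized SGD with momentum: average gradients into m, then take a
    fixed-length step along m / ||m||, so the step size is scale-free."""
    x = x0.copy()
    m = np.zeros_like(x)
    for _ in range(n_steps):
        g = grad_fn(x)                              # small-batch stochastic gradient
        m = beta * m + (1.0 - beta) * g             # exponential moving average
        x = x - lr * m / (np.linalg.norm(m) + eps)  # unit-norm update direction
    return x
```

Because the step length is set by lr rather than by the gradient magnitude, momentum's variance reduction stands in for the large batch.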

High-probability bounds for non-convex stochastic optimization with heavy tails

A Cutkosky, H Mehta - Advances in Neural Information …, 2021 - proceedings.neurips.cc
We consider non-convex stochastic optimization using first-order algorithms for which the
gradient estimates may have heavy tails. We show that a combination of gradient clipping …
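One natural way to combine the two ingredients the abstract names, per-step norm clipping followed by momentum, is sketched below; the exact placement of clipping relative to the momentum average in the paper may differ, and `clip_threshold` is an assumed name.

```python
import numpy as np

def clipped_sgd_momentum(grad_fn, x0, lr=0.01, beta=0.9, clip_threshold=1.0,
                         n_steps=1000):
    """SGD with gradient clipping plus momentum: clipping bounds the influence
    of heavy-tailed gradient samples before they enter the momentum buffer."""
    x = x0.copy()
    m = np.zeros_like(x)
    for _ in range(n_steps):
        g = grad_fn(x)
        norm = np.linalg.norm(g)
        if norm > clip_threshold:        # rescale heavy-tailed samples to the threshold
            g = g * (clip_threshold / norm)
        m = beta * m + (1.0 - beta) * g  # momentum over clipped gradients
        x = x - lr * m
    return x
```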

The marginal value of momentum for small learning rate SGD

R Wang, S Malladi, T Wang, K Lyu, Z Li - arXiv preprint arXiv:2307.15196, 2023 - arxiv.org
Momentum is known to accelerate the convergence of gradient descent in strongly convex
settings without stochastic gradient noise. In stochastic optimization, such as training neural …

Momentum ensures convergence of signSGD under weaker assumptions

T Sun, Q Wang, D Li, B Wang - International Conference on …, 2023 - proceedings.mlr.press
Sign Stochastic Gradient Descent (signSGD) is a communication-efficient stochastic
algorithm that only uses the sign information of the stochastic gradient to update the model's …
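The update in question, sketched below: keep a momentum buffer and apply only its sign, which is the regime the paper analyzes under weaker assumptions. A hedged sketch; `grad_fn`, `lr`, and `beta` are illustrative names.

```python
import numpy as np

def signsgd_momentum(grad_fn, x0, lr=1e-3, beta=0.9, n_steps=1000):
    """signSGD with momentum: the model update (and, in distributed training,
    the communication) uses only one sign bit per coordinate."""
    x = x0.copy()
    m = np.zeros_like(x)
    for _ in range(n_steps):
        g = grad_fn(x)
        m = beta * m + (1.0 - beta) * g  # momentum smooths the noisy gradient
        x = x - lr * np.sign(m)          # only the sign of each coordinate is used
    return x
```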

Meta learning on a sequence of imbalanced domains with difficulty awareness

Z Wang, T Duan, L Fang, Q Suo… - Proceedings of the …, 2021 - openaccess.thecvf.com
Recognizing new objects by learning from a few labeled examples in an evolving
environment is crucial to obtaining excellent generalization ability for real-world machine …

Online optimization over Riemannian manifolds

X Wang, Z Tu, Y Hong, Y Wu, G Shi - Journal of Machine Learning …, 2023 - jmlr.org
Online optimization has witnessed a massive surge of research attention in recent years. In
this paper, we propose online gradient descent and online bandit algorithms over …
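A concrete special case of online gradient descent on a manifold, for the unit sphere with the normalization retraction; this is a hedged sketch, since the paper states its algorithms for general Riemannian manifolds via exponential maps.

```python
import numpy as np

def online_gd_sphere(loss_grads, x0, lr=0.1):
    """Online gradient descent on the unit sphere S^{d-1}: at each round,
    project the Euclidean gradient onto the tangent space at x, step,
    then retract back onto the manifold by renormalizing."""
    x = x0 / np.linalg.norm(x0)
    for grad_fn in loss_grads:     # one adversarially chosen loss per round
        g = grad_fn(x)             # Euclidean gradient of this round's loss
        rg = g - np.dot(g, x) * x  # Riemannian gradient (tangent projection)
        x = x - lr * rg            # step in the tangent direction
        x = x / np.linalg.norm(x)  # retraction back to the sphere
    return x
```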

Riemannian optimistic algorithms

X Wang, D Yuan, Y Hong, Z Hu, L Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we consider Riemannian online convex optimization with dynamic regret. First,
we propose two novel algorithms, namely the Riemannian Online Optimistic Gradient …
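For intuition, the Euclidean analogue of the optimistic online gradient update uses the previous gradient as a prediction of the next one; a hedged sketch below, since the paper's Riemannian versions replace the linear steps with exponential maps and parallel transport.

```python
import numpy as np

def optimistic_ogd(loss_grads, x0, lr=0.1):
    """Euclidean optimistic online gradient descent (single-call form):
    x_{t+1} = x_t - lr * (2 g_t - g_{t-1}),
    i.e. a gradient step corrected by the change in consecutive gradients."""
    x = x0.copy()
    g_prev = np.zeros_like(x)
    for grad_fn in loss_grads:
        g = grad_fn(x)
        x = x - lr * (2.0 * g - g_prev)  # optimistic step: g also serves as a hint for g_{t+1}
        g_prev = g
    return x
```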

ROOT-SGD: Sharp nonasymptotics and asymptotic efficiency in a single algorithm

CJ Li, W Mou, M Wainwright… - Conference on Learning …, 2022 - proceedings.mlr.press
We study the problem of solving strongly convex and smooth unconstrained optimization
problems using stochastic first-order algorithms. We devise a novel algorithm, referred to …
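A paraphrase of the recursive estimator at the heart of the algorithm (hedged; the paper's exact weighting and burn-in details may differ): each round draws one sample, evaluates its gradient at both the current and previous iterate, and averages recursively with weight 1/t.

```python
import numpy as np

def root_sgd(grad_fn, sample_fn, x0, lr=0.1, n_steps=1000):
    """ROOT-SGD-style recursion (paraphrased):
    v_t = g(x_t; xi_t) + (1 - 1/t) * (v_{t-1} - g(x_{t-1}; xi_t)),
    x_{t+1} = x_t - lr * v_t,
    where the SAME sample xi_t is evaluated at both iterates."""
    x_prev = x0.copy()
    x = x0.copy()
    v = np.zeros_like(x)
    for t in range(1, n_steps + 1):
        xi = sample_fn()  # one fresh data sample per round
        v = grad_fn(x, xi) + (1.0 - 1.0 / t) * (v - grad_fn(x_prev, xi))
        x_prev, x = x, x - lr * v
    return x
```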