Variance-reduced methods for machine learning

RM Gower, M Schmidt, F Bach… - Proceedings of the …, 2020 - ieeexplore.ieee.org
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
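
For reference, a minimal numpy sketch of the variance-reduction idea this survey covers, in the SVRG style: an unbiased gradient estimator built from one fresh sample plus a periodically recomputed full gradient at a reference point. The toy least-squares problem, step size, and epoch length below are illustrative assumptions, not taken from the paper.

```python
# SVRG-style variance-reduced SGD for min_w (1/n) sum_i 0.5*(a_i @ w - b_i)^2
# Illustrative only: problem data, step size, and epoch length are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(w, i):
    # gradient of the i-th component 0.5*(a_i @ w - b_i)^2
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    return A.T @ (A @ w - b) / n

w = np.zeros(d)
eta, epochs, m = 0.01, 30, n  # inner-loop length m = n is a common choice

for _ in range(epochs):
    w_ref = w.copy()
    g_ref = full_grad(w_ref)          # full gradient at the reference point
    for _ in range(m):
        i = rng.integers(n)
        # variance-reduced estimator: unbiased, with variance that shrinks
        # as both iterates approach the optimum
        v = grad_i(w, i) - grad_i(w_ref, i) + g_ref
        w = w - eta * v

print("final objective:", 0.5 * np.mean((A @ w - b) ** 2))
```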

Stochastic nested variance reduction for nonconvex optimization

D Zhou, P Xu, Q Gu - Journal of machine learning research, 2020 - jmlr.org
We study nonconvex optimization problems, where the objective function is either an
average of n nonconvex functions or the expectation of some stochastic function. We …
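
A simplified sketch of a recursive (SARAH/SPIDER-style) variance-reduced gradient estimator on a nonconvex finite sum, the single-level building block that nested schemes of this kind extend; this is not the paper's nested algorithm, and the robust loss, step size, and loop lengths are assumptions.

```python
# Recursive variance-reduced estimator on a nonconvex finite sum
#   min_w (1/n) * sum_i log(1 + (a_i @ w - b_i)^2)   (a nonconvex robust loss)
# Simplified single-level variant for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(w, i):
    r = A[i] @ w - b[i]
    return (2 * r / (1 + r * r)) * A[i]

def full_grad(w):
    r = A @ w - b
    return A.T @ (2 * r / (1 + r * r)) / n

w = np.zeros(d)
eta, outer, inner = 0.02, 20, 50

for _ in range(outer):
    v = full_grad(w)                  # restart the estimator with a full gradient
    w_prev = w.copy()
    w = w - eta * v
    for _ in range(inner):
        i = rng.integers(n)
        # recursive update: correct the previous estimate with one fresh sample
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev = w.copy()
        w = w - eta * v

print("gradient norm:", np.linalg.norm(full_grad(w)))
```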

Global convergence of Langevin dynamics based algorithms for nonconvex optimization

P Xu, J Chen, D Zou, Q Gu - Advances in Neural …, 2018 - proceedings.neurips.cc
We present a unified framework to analyze the global convergence of Langevin dynamics
based algorithms for nonconvex finite-sum optimization with $n$ component functions. At …
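
The basic update analyzed in this line of work is stochastic gradient Langevin dynamics (SGLD): a stochastic gradient step plus injected Gaussian noise scaled by the step size and an inverse temperature. A minimal sketch on a toy finite sum follows; the quadratic loss, step size, and inverse temperature are illustrative assumptions.

```python
# Stochastic Gradient Langevin Dynamics (SGLD) on a toy finite-sum objective
#   F(w) = (1/n) * sum_i 0.5*(a_i @ w - b_i)^2
# Step size, inverse temperature beta, and the loss are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def stoch_grad(w, batch):
    # minibatch gradient of F at w
    Ab, bb = A[batch], b[batch]
    return Ab.T @ (Ab @ w - bb) / len(batch)

w = np.zeros(d)
eta, beta, steps, batch_size = 1e-3, 100.0, 5000, 16

for _ in range(steps):
    batch = rng.integers(n, size=batch_size)
    noise = rng.normal(size=d)
    # gradient step plus Gaussian noise scaled by sqrt(2*eta/beta)
    w = w - eta * stoch_grad(w, batch) + np.sqrt(2 * eta / beta) * noise

print("objective:", 0.5 * np.mean((A @ w - b) ** 2))
```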

Recent theoretical advances in non-convex optimization

M Danilova, P Dvurechensky, A Gasnikov… - … and Probability: With a …, 2022 - Springer
Motivated by recent increased interest in optimization algorithms for non-convex
optimization in application to training deep neural networks and other optimization problems …

Quantum speedups for stochastic optimization

A Sidford, C Zhang - Advances in Neural Information …, 2023 - proceedings.neurips.cc
We consider the problem of minimizing a continuous function given access to a
natural quantum generalization of a stochastic gradient oracle. We provide two new …

Stochastic second-order methods improve best-known sample complexity of SGD for gradient-dominated functions

S Masiha, S Salehkaleybar, N He… - Advances in …, 2022 - proceedings.neurips.cc
We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of
functions satisfying the gradient dominance property with $1\le\alpha\le2$, which holds in a …
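
A rough numpy sketch of a stochastic cubic-regularized Newton step: subsample a gradient and Hessian, then approximately minimize the cubic model $m(s) = g^\top s + \tfrac12 s^\top H s + \tfrac{\rho}{6}\|s\|^3$. The subproblem solver, sample sizes, and test problem below are illustrative assumptions, not the paper's exact algorithm or analysis.

```python
# Stochastic cubic-regularized Newton (SCRN-style) step, illustrative sketch.
import numpy as np

rng = np.random.default_rng(3)
n, d = 500, 8
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def sub_grad_hess(w, batch):
    # subsampled gradient and Hessian of (1/n) sum_i 0.5*(a_i @ w - b_i)^2
    Ab, bb = A[batch], b[batch]
    r = Ab @ w - bb
    return Ab.T @ r / len(batch), Ab.T @ Ab / len(batch)

def cubic_step(g, H, rho, iters=100, lr=0.1):
    # approximate argmin of the cubic model by gradient descent in s
    s = np.zeros_like(g)
    for _ in range(iters):
        grad_m = g + H @ s + 0.5 * rho * np.linalg.norm(s) * s
        s = s - lr * grad_m
    return s

w = np.zeros(d)
rho, batch_size = 10.0, 64
for _ in range(50):
    batch = rng.integers(n, size=batch_size)
    g, H = sub_grad_hess(w, batch)
    w = w + cubic_step(g, H, rho)

print("objective:", 0.5 * np.mean((A @ w - b) ** 2))
```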

Distributed learning systems with first-order methods

J Liu, C Zhang - Foundations and Trends® in Databases, 2020 - nowpublishers.com
Scalable and efficient distributed learning is one of the main driving forces behind the recent
rapid advancement of machine learning and artificial intelligence. One prominent feature of …

Finding second-order stationary points in nonconvex-strongly-concave minimax optimization

L Luo, Y Li, C Chen - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We study the smooth minimax optimization problem $\min_{\bf x}\max_{\bf y} f({\bf x},{\bf y})$,
where $f$ is $\ell$-smooth, strongly-concave in ${\bf y}$ but possibly nonconvex in ${\bf …
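
For this problem class, the basic first-order baseline is two-timescale gradient descent ascent (GDA). The sketch below runs GDA on a toy nonconvex-strongly-concave function; it is not the paper's method for finding second-order stationary points, and the test function, coupling matrix, and step sizes are assumptions.

```python
# Two-timescale gradient descent ascent on a toy nonconvex-strongly-concave
#   f(x, y) = sum(sin(x)) + x^T B y - (mu/2) ||y||^2
import numpy as np

rng = np.random.default_rng(4)
d = 5
B = rng.normal(size=(d, d)) / np.sqrt(d)
mu = 1.0  # strong-concavity parameter in y

def grad_x(x, y):
    return np.cos(x) + B @ y

def grad_y(x, y):
    return B.T @ x - mu * y

x, y = rng.normal(size=d), np.zeros(d)
eta_x, eta_y = 0.01, 0.1   # slower in x, faster in y (two-timescale)

for _ in range(5000):
    x = x - eta_x * grad_x(x, y)
    y = y + eta_y * grad_y(x, y)

# stationarity of the envelope Phi(x) = max_y f(x, y): y*(x) = B^T x / mu
y_star = B.T @ x / mu
print("||grad Phi(x)||:", np.linalg.norm(np.cos(x) + B @ y_star))
```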

Knowledge removal in sampling-based Bayesian inference

S Fu, F He, D Tao - arXiv preprint arXiv:2203.12964, 2022 - arxiv.org
The right to be forgotten has been legislated in many countries, but its enforcement in the AI
industry would cause unbearable costs. When single data deletion requests come …

Adaptive regularization with cubics on manifolds

N Agarwal, N Boumal, B Bullins, C Cartis - Mathematical Programming, 2021 - Springer
Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, non-convex
optimization. Akin to the trust-region method, its iterations can be thought of as approximate …
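
A Euclidean sketch of the ARC template: approximately minimize the cubic model, then accept or reject the step and adapt the regularization weight sigma via the ratio of actual to predicted decrease. This is the flat-space version only, not the Riemannian variant on manifolds developed in the paper; the Rosenbrock test function and all constants are illustrative assumptions.

```python
# Adaptive regularization with cubics (Euclidean sketch):
#   m(s) = f(x) + g^T s + 0.5 s^T H s + (sigma/3) ||s||^3
import numpy as np

def f(x):       # Rosenbrock, a standard nonconvex test function
    return 100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2

def grad(x):
    return np.array([-400 * x[0] * (x[1] - x[0] ** 2) - 2 * (1 - x[0]),
                     200 * (x[1] - x[0] ** 2)])

def hess(x):
    return np.array([[1200 * x[0] ** 2 - 400 * x[1] + 2, -400 * x[0]],
                     [-400 * x[0], 200.0]])

def solve_cubic(g, H, sigma, iters=300):
    # crude gradient-descent solver for the cubic subproblem (illustrative)
    s = np.zeros_like(g)
    lr = 1.0 / (np.linalg.norm(H, 2) + sigma + 1.0)
    for _ in range(iters):
        s = s - lr * (g + H @ s + sigma * np.linalg.norm(s) * s)
    return s

x = np.array([-1.2, 1.0])
sigma = 1.0
for _ in range(200):
    g, H = grad(x), hess(x)
    s = solve_cubic(g, H, sigma)
    model_decrease = -(g @ s + 0.5 * s @ H @ s + sigma / 3 * np.linalg.norm(s) ** 3)
    rho = (f(x) - f(x + s)) / max(model_decrease, 1e-16)
    if rho > 0.1:          # successful step: accept
        x = x + s
        if rho > 0.9:      # very successful: relax the regularization
            sigma = max(sigma / 2, 1e-6)
    else:                  # unsuccessful: reject and increase regularization
        sigma *= 2

print("x:", x, "f(x):", f(x))
```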