Variance-reduced methods for machine learning

RM Gower, M Schmidt, F Bach… - Proceedings of the …, 2020 - ieeexplore.ieee.org
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
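For context, the variance-reduced family surveyed here includes methods such as SVRG and SAGA. Below is a minimal SVRG-style sketch on a toy least-squares problem; the objective, step size, and epoch count are illustrative choices, not pseudocode from the survey.

```python
import numpy as np

def svrg(grad_i, x0, n, step=0.01, epochs=30, rng=None):
    """Minimal SVRG loop for a finite-sum objective f(x) = (1/n) * sum_i f_i(x).

    grad_i(i, x) must return the gradient of the i-th component f_i at x.
    """
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    for _ in range(epochs):
        x_ref = x.copy()
        # Full gradient at the snapshot (reference) point.
        full_grad = np.mean([grad_i(i, x_ref) for i in range(n)], axis=0)
        for _ in range(n):  # one pass of inner steps per epoch
            i = rng.integers(n)
            # Variance-reduced gradient estimate: unbiased, with variance that
            # shrinks as x and x_ref both approach the minimizer.
            g = grad_i(i, x) - grad_i(i, x_ref) + full_grad
            x -= step * g
    return x

# Toy usage: least squares (1/n) * sum_i (a_i^T x - b_i)^2 on synthetic data.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
grad_i = lambda i, x: 2.0 * (A[i] @ x - b[i]) * A[i]
x_hat = svrg(grad_i, np.zeros(5), n=50)
print(np.linalg.norm(A.T @ (A @ x_hat - b)))  # residual gradient norm; should be small
```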

Apple pomace, a bioresource of functional and nutritional components with potential of utilization in different food formulations: A review

S Kauser, MA Murtaza, A Hussain, M Imran… - Food Chemistry …, 2024 - Elsevier
Apple pomace is a substantial by-product created during the production of apple juice.
Apple pomace is commonly thrown away as waste, which harms the environment and could …

Recent theoretical advances in non-convex optimization

M Danilova, P Dvurechensky, A Gasnikov… - … and Probability: With a …, 2022 - Springer
Motivated by recent increased interest in algorithms for non-convex optimization, in
application to training deep neural networks and other optimization problems …

A hybrid stochastic optimization framework for composite nonconvex optimization

Q Tran-Dinh, NH Pham, DT Phan… - Mathematical Programming, 2022 - Springer
We introduce a new approach to develop stochastic optimization algorithms for a class of
stochastic composite and possibly nonconvex optimization problems. The main idea is to …
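The "hybrid" estimators in this line of work are often described as a convex combination of a SARAH-style recursive (biased) term and a plain unbiased stochastic gradient. The sketch below implements that combination under that reading; the mixing weight `beta`, step size, and loop structure are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def hybrid_sgd(grad_i, x0, n, step=0.01, beta=0.9, iters=2000, rng=None):
    """Sketch of a hybrid variance-reduced estimator: a convex combination of a
    SARAH-style recursive (biased) term and a plain unbiased SGD term."""
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    v = grad_i(rng.integers(n), x)  # initialize the estimator with one sample
    for _ in range(iters):
        x_next = x - step * v
        i, j = rng.integers(n), rng.integers(n)
        # Recursive correction on sample i, fresh unbiased gradient on sample j.
        v = beta * (v + grad_i(i, x_next) - grad_i(i, x)) \
            + (1.0 - beta) * grad_i(j, x_next)
        x = x_next
    return x
```

It can be dropped into the same toy least-squares setup used in the SVRG sketch above.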

SGD converges to global minimum in deep learning via star-convex path

Y Zhou, J Yang, H Zhang, Y Liang, V Tarokh - arXiv preprint arXiv …, 2019 - arxiv.org
Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a
variety of deep neural networks. However, there is still a lack of understanding of how and …
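The method analysed here is plain SGD itself; for reference, a minimal minibatch SGD loop looks as follows, with batch size and step size as arbitrary illustrative choices.

```python
import numpy as np

def sgd(grad_i, x0, n, step=0.05, batch=8, iters=1000, rng=None):
    """Plain minibatch SGD: the baseline method whose trajectory the
    star-convex-path analysis studies."""
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        idx = rng.integers(n, size=batch)
        g = np.mean([grad_i(i, x) for i in idx], axis=0)  # minibatch gradient
        x -= step * g
    return x
```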

Stochastic second-order methods improve best-known sample complexity of SGD for gradient-dominated functions

S Masiha, S Salehkaleybar, N He… - Advances in …, 2022 - proceedings.neurips.cc
We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of
functions satisfying the gradient dominance property with $1 \le \alpha \le 2$, which holds in a …
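Cubic-regularized Newton methods replace the gradient step with (an approximation of) the minimizer of a cubic model built from sampled derivatives. A rough sketch under that reading is below; the regularization constant `M`, the gradient-descent inner solver, and the oracle interfaces `grad_batch`/`hess_batch` are assumptions for illustration, not the SCRN variant analysed in the paper.

```python
import numpy as np

def cubic_step(g, H, M=1.0, inner_iters=200):
    """Approximately minimize the cubic model
        m(s) = g^T s + 0.5 * s^T H s + (M / 6) * ||s||^3
    by gradient descent on m (a deliberately simple inner solver; practical
    implementations use Krylov or eigenvalue-based subproblem solvers)."""
    s = np.zeros_like(g, dtype=float)
    eta = 1.0 / (np.linalg.norm(H, 2) + M + 1e-8)  # crude inner step size
    for _ in range(inner_iters):
        grad_m = g + H @ s + 0.5 * M * np.linalg.norm(s) * s
        s -= eta * grad_m
    return s

def scrn_sketch(grad_batch, hess_batch, x0, M=1.0, iters=50):
    """At each outer iteration, form a cubic model from subsampled gradient and
    Hessian estimates and move to an approximate minimizer of that model."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        g, H = grad_batch(x), hess_batch(x)
        x = x + cubic_step(g, H, M=M)
    return x
```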

Stochastic subspace cubic Newton method

F Hanzely, N Doikov, Y Nesterov… - … on Machine Learning, 2020 - proceedings.mlr.press
In this paper, we propose a new randomized second-order optimization algorithm—
Stochastic Subspace Cubic Newton (SSCN)—for minimizing a high-dimensional convex …
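In the coordinate-subspace special case, a subspace cubic Newton step samples a block of coordinates, builds the cubic model restricted to that block, and updates only those coordinates. The sketch below follows that reading; the block size, constant `M`, and inner solver are illustrative assumptions, and for brevity it forms the full derivatives and slices them, whereas the point of the method is to evaluate only the block gradient and block Hessian.

```python
import numpy as np

def sscn_sketch(grad, hess, x0, block=2, M=1.0, iters=100, rng=None):
    """Coordinate-subspace sketch of a subspace cubic Newton step: sample a
    random block of coordinates, build the cubic model restricted to that block,
    and update only those coordinates."""
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    d = x.shape[0]
    for _ in range(iters):
        S = rng.choice(d, size=block, replace=False)
        g_S = grad(x)[S]                 # block gradient (sliced for brevity)
        H_S = hess(x)[np.ix_(S, S)]      # block Hessian (sliced for brevity)
        # Gradient descent on the block cubic model
        #   g_S^T s + 0.5 * s^T H_S s + (M / 6) * ||s||^3.
        s = np.zeros(block)
        eta = 1.0 / (np.linalg.norm(H_S, 2) + M + 1e-8)
        for _ in range(100):
            s -= eta * (g_S + H_S @ s + 0.5 * M * np.linalg.norm(s) * s)
        x[S] += s
    return x
```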

Efficient hyper-parameter optimization with cubic regularization

Z Shen, H Yang, Y Li, J Kwok… - Advances in Neural …, 2023 - proceedings.neurips.cc
As hyper-parameters are ubiquitous and can significantly affect model performance,
hyper-parameter optimization is extremely important in machine learning. In this paper, we …

Adaptive regularization with cubics on manifolds

N Agarwal, N Boumal, B Bullins, C Cartis - Mathematical Programming, 2021 - Springer
Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, non-convex
optimization. Akin to the trust-region method, its iterations can be thought of as approximate …

Hessian averaging in stochastic Newton methods achieves superlinear convergence

S Na, M Dereziński, MW Mahoney - Mathematical Programming, 2023 - Springer
We consider minimizing a smooth and strongly convex objective function using a stochastic
Newton method. At each iteration, the algorithm is given an oracle access to a stochastic …
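The core idea, as the title suggests, is to reuse past stochastic Hessians by averaging them across iterations before taking a Newton step. A minimal sketch with a uniform running average and an exact gradient oracle is below; it assumes the sampled Hessians (and hence their average) are positive definite, and the weighting and damping schemes in the paper are more refined than this.

```python
import numpy as np

def averaged_newton(grad, hess_sample, x0, iters=50):
    """Sketch of Hessian averaging: keep a running (uniform) average of the
    stochastic Hessians observed so far and use it in place of the true
    Hessian for the Newton step. Assumes each sampled Hessian, and hence
    the average, is positive definite (e.g., a strongly convex objective)."""
    x = np.array(x0, dtype=float)
    d = x.shape[0]
    H_avg = np.zeros((d, d))
    for t in range(1, iters + 1):
        H_t = hess_sample(x)                    # one stochastic Hessian estimate
        H_avg += (H_t - H_avg) / t              # uniform running average
        x -= np.linalg.solve(H_avg, grad(x))    # Newton step with averaged Hessian
    return x
```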