Variance-reduced methods for machine learning
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
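The variance-reduction idea behind this line of work can be illustrated with an SVRG-style estimator: a full gradient computed at a periodic snapshot serves as a control variate for the per-sample stochastic gradients, so the update stays unbiased while its variance shrinks near a solution. The sketch below is a minimal illustration on a least-squares objective; the function name, objective, and step size are assumptions, not details taken from the survey.

```python
import numpy as np

def svrg_least_squares(A, b, step=0.1, epochs=20, seed=0):
    """SVRG-style variance-reduced SGD for f(x) = ||Ax - b||^2 / (2n).

    Illustrative sketch only: objective, step size, and loop lengths are
    assumptions, not taken from the cited survey.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        x_snap = x.copy()
        full_grad = A.T @ (A @ x_snap - b) / n            # exact gradient at the snapshot
        for _ in range(n):
            i = rng.integers(n)
            g_i = A[i] * (A[i] @ x - b[i])                # per-sample gradient at x
            g_i_snap = A[i] * (A[i] @ x_snap - b[i])      # same sample at the snapshot
            # control-variate update: unbiased, with variance that vanishes
            # as the iterate and snapshot both approach a minimizer
            x -= step * (g_i - g_i_snap + full_grad)
    return x
```

Because the correction term has zero mean, the estimator remains unbiased; its variance decays as the iterates approach a minimizer, which is what lets methods in this family use a constant step size where plain SGD needs a decaying one.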
Apple pomace, a bioresource of functional and nutritional components with potential of utilization in different food formulations: A review
Apple pomace is a substantial by-product created during the production of apple juice.
Apple pomace is commonly thrown away as waste, which harms the environment and could …
Recent theoretical advances in non-convex optimization
Motivated by the recent increase in interest in optimization algorithms for non-convex
optimization, as applied to training deep neural networks and other optimization problems …
A hybrid stochastic optimization framework for composite nonconvex optimization
We introduce a new approach to develop stochastic optimization algorithms for a class of
stochastic composite and possibly nonconvex optimization problems. The main idea is to …
SGD converges to global minimum in deep learning via star-convex path
Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a
variety of deep neural networks. However, there is still a lack of understanding of how and …
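For reference, the star-convexity notion in this title can be written as a first-order inequality toward a minimizer x*; this is a standard formulation stated here for context, with the caveat that the paper imposes such a condition only along the SGD iterate path rather than globally.

```latex
% First-order star-convexity inequality toward a minimizer x^* (standard form,
% stated for context; the cited paper requires it only along the SGD path).
f(x^{*}) \;\ge\; f(x_t) + \langle \nabla f(x_t),\, x^{*} - x_t \rangle
\quad\Longleftrightarrow\quad
\langle \nabla f(x_t),\, x_t - x^{*} \rangle \;\ge\; f(x_t) - f(x^{*}) \;\ge\; 0 .
```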
Stochastic second-order methods improve best-known sample complexity of SGD for gradient-dominated functions
We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of
functions satisfying the gradient dominance property with $1 \le \alpha \le 2$, which holds in a …
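The gradient dominance property referred to here has a standard form, reproduced below for context; the constant name $\tau_f$ is illustrative rather than taken from the paper. The exponent $\alpha = 2$ recovers the usual Polyak-Lojasiewicz condition.

```latex
% Gradient dominance (generalized Polyak--Lojasiewicz) condition with
% exponent alpha in [1, 2]; tau_f > 0 is an illustrative constant name.
f(x) - \min_{y} f(y) \;\le\; \tau_f\, \|\nabla f(x)\|^{\alpha},
\qquad 1 \le \alpha \le 2 .
```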
Stochastic subspace cubic Newton method
In this paper, we propose a new randomized second-order optimization algorithm—
Stochastic Subspace Cubic Newton (SSCN)—for minimizing a high dimensional convex …
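For context, the cubically regularized Newton model minimized at each step of such methods has the standard form below; the subspace idea is to restrict the step to the range of a random sketch matrix so that only sketched derivatives are needed. The symbols M, S_k, and tau are illustrative, not the paper's notation.

```latex
% Cubically regularized second-order model at the iterate x_k (standard form;
% M > 0 is a regularization constant).
x_{k+1} \;=\; x_k + \operatorname*{arg\,min}_{s}\;
  \langle \nabla f(x_k),\, s\rangle
  + \tfrac{1}{2}\langle \nabla^{2} f(x_k)\, s,\, s\rangle
  + \tfrac{M}{6}\|s\|^{3}.
% Subspace variant (illustrative): search only over s = S_k z with a random
% sketch matrix S_k of tau columns, so the subproblem is tau-dimensional
% rather than d-dimensional.
```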
Efficient hyper-parameter optimization with cubic regularization
As hyper-parameters are ubiquitous and can significantly affect the model performance,
hyper-parameter optimization is extremely important in machine learning. In this paper, we …
Adaptive regularization with cubics on manifolds
Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, non-convex
optimization. Akin to the trust-region method, its iterations can be thought of as approximate …
Hessian averaging in stochastic Newton methods achieves superlinear convergence
We consider minimizing a smooth and strongly convex objective function using a stochastic
Newton method. At each iteration, the algorithm is given oracle access to a stochastic …
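The averaging idea in this title can be sketched as follows: instead of using only the latest noisy Hessian estimate, the method keeps a running average of all stochastic Hessians observed so far and takes Newton steps with that average, so the Hessian noise decays over the run. The sketch below is a generic NumPy illustration; the sampling interface, ridge term, and step size are assumptions rather than the paper's exact algorithm.

```python
import numpy as np

def averaged_stochastic_newton(grad, hess_sample, x0, steps=100, step=1.0, seed=0):
    """Stochastic Newton with a uniform running average of sampled Hessians.

    grad(x)             -> exact (or mini-batch) gradient, shape (d,)
    hess_sample(x, rng) -> one noisy Hessian estimate, shape (d, d)
    Illustrative sketch only; not the cited paper's exact method.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    h_avg = np.zeros((d, d))
    for k in range(steps):
        # Uniform running average of all stochastic Hessians seen so far;
        # its noise shrinks roughly like 1/sqrt(k+1).
        h_avg += (hess_sample(x, rng) - h_avg) / (k + 1)
        # Newton step with the averaged Hessian (small ridge term added only
        # to keep the linear solve well posed in this sketch).
        x -= step * np.linalg.solve(h_avg + 1e-8 * np.eye(d), grad(x))
    return x
```

As the average concentrates around the true Hessian, the steps approach exact Newton steps, which is the mechanism behind the superlinear rate the title claims for smooth, strongly convex objectives.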