PyHessian: Neural networks through the lens of the Hessian

Z Yao, A Gholami, K Keutzer… - 2020 IEEE international …, 2020 - ieeexplore.ieee.org
We present PyHessian, a new scalable framework that enables fast computation of
Hessian (i.e., second-order derivative) information for deep neural networks. PyHessian …
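
The workhorse behind this kind of tooling is the Hessian-vector product computed by double backpropagation; top eigenvalues, the trace, and spectral densities are then built on top of that primitive. A minimal PyTorch sketch of the primitive, not PyHessian's actual API (the model, data, and names below are placeholders):

```python
import torch

def hessian_vector_product(loss, params, vec):
    """Compute (Hessian of loss w.r.t. params) @ vec via double backprop,
    without ever materializing the Hessian."""
    # First backward pass, keeping the graph so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Inner product <grad, vec>; a second backward pass yields H @ vec.
    grad_dot_v = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(grad_dot_v, params)

# Illustrative usage on a tiny placeholder model.
model = torch.nn.Linear(4, 1)
x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
params = list(model.parameters())
v = [torch.randn_like(p) for p in params]
hv = hessian_vector_product(loss, params, v)
```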

Second-order stochastic optimization for machine learning in linear time

N Agarwal, B Bullins, E Hazan - Journal of Machine Learning Research, 2017 - jmlr.org
First-order stochastic methods are the state-of-the-art in large-scale machine learning
optimization owing to efficient per-iteration complexity. Second-order methods, while able to …
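
One standard way to get Newton-like steps at roughly first-order cost is to estimate inverse-Hessian-vector products with a truncated Neumann series driven by stochastic Hessian-vector products. A rough NumPy sketch of that estimator, assuming the Hessian has been scaled so its eigenvalues lie in (0, 1]; the matrix and depth below are illustrative stand-ins, not the paper's algorithm:

```python
import numpy as np

def neumann_inverse_hvp(hvp, v, depth=50):
    """Estimate H^{-1} v via the recursion p_j = v + (I - H) p_{j-1},
    which converges when the eigenvalues of H lie in (0, 1]."""
    p = v.copy()
    for _ in range(depth):
        p = v + p - hvp(p)   # p <- v + (I - H) p
    return p

# Illustrative usage: a small SPD matrix stands in for a stochastic
# Hessian-vector product oracle.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
H = A @ A.T + np.eye(5)
H /= np.linalg.norm(H, 2) * 1.1          # scale eigenvalues into (0, 1)
v = rng.normal(size=5)
approx = neumann_inverse_hvp(lambda u: H @ u, v, depth=500)
print(np.allclose(approx, np.linalg.solve(H, v), atol=1e-3))
```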

Newton-type methods for non-convex optimization under inexact Hessian information

P Xu, F Roosta, MW Mahoney - Mathematical Programming, 2020 - Springer
We consider variants of trust-region and adaptive cubic regularization methods for non-
convex optimization, in which the Hessian matrix is approximated. Under certain condition …
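
To make the trust-region template concrete: each step minimizes a quadratic model (built from a possibly inexact Hessian) inside a ball, then grows or shrinks the ball according to how well the model predicted the actual decrease. A toy NumPy sketch using a Cauchy-point subproblem solve; the thresholds and test function are illustrative, not the paper's algorithm:

```python
import numpy as np

def cauchy_point(g, H, radius):
    """Minimizer of the quadratic model along -g, restricted to the trust region."""
    t = radius / np.linalg.norm(g)
    gHg = g @ H @ g
    if gHg > 0:
        t = min(t, (g @ g) / gHg)
    return -t * g

def trust_region_step(f, grad, hess, x, radius):
    g, H = grad(x), hess(x)                      # H may be inexact / subsampled
    s = cauchy_point(g, H, radius)
    predicted = -(g @ s + 0.5 * s @ H @ s)       # model decrease
    actual = f(x) - f(x + s)                     # true decrease
    rho = actual / predicted
    if rho > 0.75:                               # good model: accept and expand
        return x + s, 2.0 * radius
    if rho > 0.1:                                # acceptable: accept, keep radius
        return x + s, radius
    return x, 0.5 * radius                       # poor model: reject and shrink

# Illustrative usage on a simple test function.
f = lambda x: np.sum(x**4) + np.sum(x**2)
grad = lambda x: 4 * x**3 + 2 * x
hess = lambda x: np.diag(12 * x**2 + 2)
x, radius = np.ones(3), 1.0
for _ in range(30):
    x, radius = trust_region_step(f, grad, hess, x, radius)
```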

Shampoo: Preconditioned stochastic tensor optimization

V Gupta, T Koren, Y Singer - International Conference on …, 2018 - proceedings.mlr.press
Preconditioned gradient methods are among the most general and powerful tools in
optimization. However, preconditioning requires storing and manipulating prohibitively large …
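
For a matrix-shaped parameter, the idea is to keep two small per-dimension statistics and precondition the gradient from both sides with their inverse fourth roots, instead of storing one full-sized preconditioner. A rough NumPy sketch of that update; the damping, learning rate, and plain (non-decayed) accumulation are illustrative simplifications:

```python
import numpy as np

def inv_root(mat, p, damping=1e-4):
    """Symmetric PSD matrix raised to the power -1/p via eigendecomposition."""
    w, q = np.linalg.eigh(mat + damping * np.eye(mat.shape[0]))
    return (q * w ** (-1.0 / p)) @ q.T

def shampoo_update(W, grad, L, R, lr=0.1):
    """One preconditioned step for a matrix parameter W of shape (m, n)."""
    L += grad @ grad.T          # left statistic,  (m, m)
    R += grad.T @ grad          # right statistic, (n, n)
    W -= lr * inv_root(L, 4) @ grad @ inv_root(R, 4)
    return W, L, R

# Illustrative usage on a small least-squares problem.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(32, 6)), rng.normal(size=(32, 3))
W = np.zeros((6, 3))
L, R = np.zeros((6, 6)), np.zeros((3, 3))
for _ in range(100):
    grad = X.T @ (X @ W - Y) / len(X)
    W, L, R = shampoo_update(W, grad, L, R)
```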

Exact and inexact subsampled Newton methods for optimization

R Bollapragada, RH Byrd… - IMA Journal of Numerical …, 2019 - academic.oup.com
The paper studies the solution of stochastic optimization problems in which approximations
to the gradient and Hessian are obtained through subsampling. We first consider Newton …
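
The basic template is to compute a (full or sampled) gradient, estimate the Hessian from a smaller subsample, and solve the resulting Newton system. A small NumPy sketch for regularized logistic regression with an exact solve of the subsampled system; the sample size and regularizer are illustrative choices, not the paper's prescriptions:

```python
import numpy as np

def subsampled_newton_step(w, X, y, hess_sample_size, rng, reg=1e-3):
    """One Newton step: full gradient, Hessian estimated from a subsample."""
    n = len(X)
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (p - y) / n + reg * w                  # full gradient

    idx = rng.choice(n, size=hess_sample_size, replace=False)
    Xs, ps = X[idx], p[idx]
    D = ps * (1.0 - ps)                                 # per-sample curvature
    H = (Xs * D[:, None]).T @ Xs / hess_sample_size + reg * np.eye(len(w))
    return w - np.linalg.solve(H, grad)                 # exact subproblem solve

# Illustrative usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X @ rng.normal(size=10) + 0.1 * rng.normal(size=2000) > 0).astype(float)
w = np.zeros(10)
for _ in range(10):
    w = subsampled_newton_step(w, X, y, hess_sample_size=200, rng=rng)
```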

Stochastic block BFGS: Squeezing more curvature out of data

R Gower, D Goldfarb… - … Conference on Machine …, 2016 - proceedings.mlr.press
We propose a novel limited-memory stochastic block BFGS update for incorporating
enriched curvature information in stochastic approximation methods. In our method, the …
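
The curvature these methods extract comes from pairing a displacement in parameter space with the matching change in (subsampled) gradient information and folding the pair into an inverse-Hessian estimate. As a simplified illustration, here is the classical single-pair BFGS inverse update fed with sampled directions and exact curvature responses; this is plain BFGS, not the block update proposed in the paper:

```python
import numpy as np

def bfgs_inverse_update(B, s, y, eps=1e-10):
    """Standard BFGS update of an inverse-Hessian estimate B from a pair (s, y),
    where s is a displacement and y the matching gradient (curvature) change."""
    sy = s @ y
    if sy < eps:                      # skip pairs violating the curvature condition
        return B
    rho = 1.0 / sy
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ B @ V.T + rho * np.outer(s, s)

# Illustrative usage: refine an inverse-Hessian estimate for a fixed SPD matrix
# from randomly sampled directions s with exact responses y = H s.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
H = A @ A.T + np.eye(6)
B = np.eye(6)
for _ in range(200):
    s = rng.normal(size=6)
    B = bfgs_inverse_update(B, s, H @ s)
# The error shrinks as more curvature pairs are folded in.
print(np.linalg.norm(B - np.linalg.inv(H)))
```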

Faster differentially private convex optimization via second-order methods

A Ganesh, M Haghifam, T Steinke… - Advances in Neural …, 2024 - proceedings.neurips.cc
Differentially private (stochastic) gradient descent is the workhorse of DP machine
learning in both the convex and non-convex settings. Without privacy constraints, second …
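
The generic recipe for privatizing a second-order step is to bound each example's influence by clipping, add Gaussian noise to the gradient and Hessian statistics, and then take a regularized Newton step on the noisy quantities. A toy NumPy sketch of that recipe for logistic regression; the clipping threshold, noise scales, and regularizer below are illustrative placeholders, are not calibrated to any privacy budget, and do not reproduce the paper's method:

```python
import numpy as np

def noisy_newton_step(w, X, y, rng, clip=1.0, noise_grad=0.1,
                      noise_hess=0.1, reg=1e-1):
    """One Newton-type step on clipped, noised statistics (toy sketch only)."""
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-X @ w))

    # Per-example gradients, clipped to bound each example's influence.
    per_ex = X * (p - y)[:, None]
    scale = np.maximum(np.linalg.norm(per_ex, axis=1) / clip, 1.0)
    grad = (per_ex / scale[:, None]).sum(axis=0) / n
    grad += rng.normal(scale=noise_grad * clip / n, size=d)

    # Hessian estimate plus symmetric Gaussian noise and damping.
    D = p * (1.0 - p)
    H = (X * D[:, None]).T @ X / n
    E = rng.normal(scale=noise_hess / n, size=(d, d))
    H += (E + E.T) / 2 + reg * np.eye(d)
    return w - np.linalg.solve(H, grad)

# Illustrative usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
y = (X @ np.ones(5) > 0).astype(float)
w = np.zeros(5)
for _ in range(5):
    w = noisy_newton_step(w, X, y, rng)
```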

An overview of stochastic quasi-Newton methods for large-scale machine learning

TD Guo, Y Liu, CY Han - Journal of the Operations Research Society of …, 2023 - Springer
Numerous intriguing optimization problems arise as a result of the advancement of machine
learning. The stochastic first-order method is the predominant choice for those problems due …

Sub-sampled cubic regularization for non-convex optimization

JM Kohler, A Lucchi - International Conference on Machine …, 2017 - proceedings.mlr.press
We consider the minimization of non-convex functions that typically arise in machine
learning. Specifically, we focus our attention on a variant of trust region methods known as …
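
Each outer iteration of cubic regularization minimizes a local model of the form m(s) = gᵀs + ½ sᵀH s + (σ/3)‖s‖³, with the sub-sampled variants replacing g and H by estimates from a random subsample. A toy NumPy sketch that solves the subproblem by plain gradient descent; the solver, step sizes, and test function are illustrative simplifications rather than the paper's procedure:

```python
import numpy as np

def solve_cubic_subproblem(g, H, sigma, steps=500, lr=0.01):
    """Minimize m(s) = g.s + 0.5 s.H.s + (sigma/3)||s||^3 by gradient descent
    (a simple stand-in for the more careful subproblem solvers used in practice)."""
    s = np.zeros_like(g)
    for _ in range(steps):
        grad_m = g + H @ s + sigma * np.linalg.norm(s) * s
        s -= lr * grad_m
    return s

# Illustrative usage: a few outer steps on a toy non-convex function.
f = lambda x: np.sum(x**4 - x**2)
grad = lambda x: 4 * x**3 - 2 * x
hess = lambda x: np.diag(12 * x**2 - 2)
x, sigma = np.full(3, 0.3), 1.0
for _ in range(20):
    s = solve_cubic_subproblem(grad(x), hess(x), sigma)
    x = x + s
```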

Sub-sampled Newton methods

F Roosta-Khorasani, MW Mahoney - Mathematical Programming, 2019 - Springer
For large-scale finite-sum minimization problems, we study non-asymptotic and high-
probability global as well as local convergence properties of variants of Newton's method …
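
When even forming the subsampled Hessian is too expensive, the Newton system is typically solved inexactly with conjugate gradients, which only requires Hessian-vector products on the subsample. A small NumPy sketch of that inexact variant; the least-squares-style curvature, damping, and sample sizes are illustrative placeholders:

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-8, max_iter=200):
    """Solve A x = b for symmetric positive-definite A, given only
    matrix-vector products with A."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Inexact Newton step on a least-squares-style problem: the sub-sampled
# Hessian is never formed, only applied to vectors inside CG.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
idx = rng.choice(400, size=40, replace=False)       # Hessian subsample
def sub_hvp(v):
    Xs = X[idx]
    return Xs.T @ (Xs @ v) / len(idx) + 1e-3 * v    # damping keeps it SPD
grad = rng.normal(size=8)                           # placeholder gradient
newton_step = -conjugate_gradient(sub_hvp, grad)
```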