Nonconvex optimization meets low-rank matrix factorization: An overview

Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org
Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …
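
A minimal sketch of the kind of nonconvex approach the survey covers, assuming gradient descent on a symmetric Burer-Monteiro factorization $M \approx XX^\top$; all constants are illustrative, not from the paper:

```python
import numpy as np

# Sketch: factorized gradient descent for symmetric low-rank recovery,
# minimizing f(X) = (1/4) ||X X^T - M||_F^2 over X of shape (n, r).
rng = np.random.default_rng(0)
n, r = 50, 3
U = rng.standard_normal((n, r))
M = U @ U.T                              # ground-truth rank-r PSD matrix

X = 0.1 * rng.standard_normal((n, r))    # small random initialization
eta = 1e-3                               # step size (assumed, not tuned)
for _ in range(5000):
    X -= eta * (X @ X.T - M) @ X         # gradient of f (M is symmetric)

print("relative error:", np.linalg.norm(X @ X.T - M) / np.linalg.norm(M))
```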

Complete dictionary recovery over the sphere I: Overview and the geometric picture

J Sun, Q Qu, J Wright - IEEE Transactions on Information …, 2016 - ieeexplore.ieee.org
We consider the problem of recovering a complete (i.e., square and invertible) matrix $A_0$
from $Y \in \mathbb{R}^{n \times p}$ with $Y = A_0 X_0$, provided $X_0$ is sufficiently sparse. This recovery problem is …
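
A minimal sketch of this setup, assuming the smoothed $\ell_1$ surrogate $h_\mu(z) = \mu \log\cosh(z/\mu)$ minimized over the unit sphere by Riemannian gradient descent; all parameters are illustrative:

```python
import numpy as np

# Sketch: recover one sparse row of X0 by finding a sparse direction q^T Y.
rng = np.random.default_rng(0)
n, p, theta = 10, 5000, 0.1
A0 = np.linalg.qr(rng.standard_normal((n, n)))[0]   # complete (orthogonal) dictionary
X0 = rng.standard_normal((n, p)) * (rng.random((n, p)) < theta)  # Bernoulli-Gaussian
Y = A0 @ X0

mu, eta = 0.1, 0.05
q = rng.standard_normal(n); q /= np.linalg.norm(q)
for _ in range(2000):
    g = Y @ np.tanh(Y.T @ q / mu) / p    # Euclidean grad of (1/p) sum h_mu(q^T y_i)
    g -= (q @ g) * q                     # project onto the tangent space of the sphere
    q -= eta * g
    q /= np.linalg.norm(q)               # retract back to the sphere

# q should align with one column of A0, so q^T Y recovers one row of X0
print("max alignment with dictionary columns:", np.max(np.abs(A0.T @ q)))
```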

Learning single-index models with shallow neural networks

A Bietti, J Bruna, C Sanford… - Advances in neural …, 2022 - proceedings.neurips.cc
Single-index models are a class of functions given by an unknown univariate "link" function
applied to an unknown one-dimensional projection of the input. These models are …
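
A minimal sketch of the learning setup, with an arbitrary assumed link function and a shallow ReLU network trained by plain full-batch gradient descent (my construction, not the paper's algorithm):

```python
import numpy as np

# Sketch: fit a shallow ReLU net to y = sigma(w_star . x), where both the
# link sigma and the direction w_star are unknown to the learner.
rng = np.random.default_rng(0)
d, n, m = 20, 4096, 64
w_star = rng.standard_normal(d); w_star /= np.linalg.norm(w_star)
sigma = lambda t: t * np.tanh(t)              # an arbitrary link (assumed)
X = rng.standard_normal((n, d))
y = sigma(X @ w_star)

W = rng.standard_normal((m, d)) / np.sqrt(d)  # first-layer weights
a = rng.standard_normal(m) / np.sqrt(m)       # second-layer weights
eta = 0.05
for _ in range(2000):
    H = np.maximum(X @ W.T, 0.0)              # hidden activations, shape (n, m)
    err = H @ a - y                           # residuals
    ga = H.T @ err / n                        # gradient w.r.t. a
    gW = ((err[:, None] * (H > 0)) * a).T @ X / n   # gradient w.r.t. W
    a -= eta * ga; W -= eta * gW

print("train MSE:", np.mean((np.maximum(X @ W.T, 0) @ a - y) ** 2))
```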

A geometric analysis of neural collapse with unconstrained features

Z Zhu, T Ding, J Zhou, X Li, C You… - Advances in Neural …, 2021 - proceedings.neurips.cc
We provide the first global optimization landscape analysis of Neural Collapse, an intriguing
empirical phenomenon that arises in the last-layer classifiers and features of neural …
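
A minimal sketch of how variability collapse can be quantified, assuming the common NC1 statistic trace$(\Sigma_W \Sigma_B^\dagger)$ computed from last-layer features; the data below is synthetic:

```python
import numpy as np

# Sketch: NC1 compares within-class scatter to between-class scatter;
# it approaches 0 as features collapse to their class means.
def nc1(H, y):
    classes = np.unique(y)
    mu_g = H.mean(axis=0)                            # global feature mean
    Sw = np.zeros((H.shape[1], H.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in classes:
        Hc = H[y == c]
        mu_c = Hc.mean(axis=0)
        Sw += (Hc - mu_c).T @ (Hc - mu_c) / len(H)   # within-class scatter
        Sb += np.outer(mu_c - mu_g, mu_c - mu_g) * len(Hc) / len(H)
    return np.trace(Sw @ np.linalg.pinv(Sb))

rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 100)
H = np.eye(4)[y] @ rng.standard_normal((4, 16)) + 0.01 * rng.standard_normal((400, 16))
print("NC1:", nc1(H, y))   # small, since features cluster tightly at class means
```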

Implicit regularization in deep matrix factorization

S Arora, N Cohen, W Hu, Y Luo - Advances in neural …, 2019 - proceedings.neurips.cc
Efforts to understand the generalization mystery in deep learning have led to the belief that
gradient-based optimization induces a form of implicit regularization, a bias towards models …
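
A minimal sketch of deep matrix factorization on a matrix-completion loss, assuming depth three and small initialization; the singular-value printout illustrates the low-rank bias, and all constants are illustrative:

```python
import numpy as np

# Sketch: fit observed entries of a rank-2 matrix with the overparameterized
# product W3 @ W2 @ W1; gradient descent from small init favors low rank.
rng = np.random.default_rng(0)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-2 target
mask = rng.random((n, n)) < 0.5                                 # observed entries

scale, eta = 5e-2, 0.3
W1, W2, W3 = (scale * rng.standard_normal((n, n)) for _ in range(3))
for _ in range(20000):
    P = W3 @ W2 @ W1
    G = np.where(mask, P - M, 0.0) / mask.sum()   # grad of loss w.r.t. the product
    gW3 = G @ (W2 @ W1).T
    gW2 = W3.T @ G @ W1.T
    gW1 = (W3 @ W2).T @ G
    W1 -= eta * gW1; W2 -= eta * gW2; W3 -= eta * gW3

s = np.linalg.svd(W3 @ W2 @ W1, compute_uv=False)
print("top singular values:", np.round(s[:5], 3))  # mass concentrates on ~r directions
```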

Dying relu and initialization: Theory and numerical examples

L Lu, Y Shin, Y Su, GE Karniadakis - arxiv preprint arxiv:1903.06733, 2019 - arxiv.org
The dying ReLU refers to the problem in which ReLU neurons become inactive and output
only 0 for any input. There are many empirical and heuristic explanations of why ReLU neurons …
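
A minimal sketch of the phenomenon, assuming a one-layer setting in which a negative bias shift at initialization leaves units inactive on every sample:

```python
import numpy as np

# Sketch: a unit is "dead" if its pre-activation is non-positive on every
# input in the sample, so its ReLU output and gradient are always zero.
rng = np.random.default_rng(0)
d, m, n = 100, 256, 1000
X = rng.standard_normal((n, d))
W = rng.standard_normal((m, d)) / np.sqrt(d)   # unit-norm rows on average

for bias in (0.0, -3.0, -4.0):
    pre = X @ W.T + bias              # pre-activations under a constant bias shift
    dead = np.all(pre <= 0, axis=0)   # never active on any sample => dead unit
    print(f"bias={bias:+.1f}: {dead.mean():.1%} dead ReLU units")
```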

On the optimization landscape of neural collapse under mse loss: Global optimality with unconstrained features

J Zhou, X Li, T Ding, C You, Q Qu… - … on Machine Learning, 2022 - proceedings.mlr.press
When training deep neural networks for classification tasks, an intriguing empirical
phenomenon has been widely observed in the last-layer classifiers and features, where (i) …
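
A minimal sketch of the simplex equiangular tight frame (ETF) geometry that such analyses identify as globally optimal; the construction below is the standard one, not code from the paper:

```python
import numpy as np

# Sketch: build a K-class simplex ETF in d dimensions and verify its Gram
# structure: unit-norm columns with pairwise inner products -1/(K-1).
K, d = 5, 16
U = np.linalg.qr(np.random.default_rng(0).standard_normal((d, K)))[0]  # orthonormal cols
M = np.sqrt(K / (K - 1)) * U @ (np.eye(K) - np.ones((K, K)) / K)       # ETF columns

G = M.T @ M
print(np.round(G, 3))   # diagonal = 1, off-diagonal = -1/(K-1)
```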

Lower bounds for non-convex stochastic optimization

Y Arjevani, Y Carmon, JC Duchi, DJ Foster… - Mathematical …, 2023 - Springer
We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ)
using stochastic first-order methods. In a well-studied model where algorithms access …
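
A toy illustration (mine, not the paper's construction) of the quantity being lower-bounded: the number of stochastic first-order oracle calls until the true gradient norm falls below ϵ:

```python
import numpy as np

# Sketch: SGD on the nonconvex scalar f(x) = x*sin(x); stop when the true
# gradient (used only for the stopping check) is at most epsilon in magnitude.
rng = np.random.default_rng(0)
grad = lambda x: np.sin(x) + x * np.cos(x)   # f'(x) for f(x) = x sin(x)
x, eps, eta = 2.0, 1e-2, 1e-2
for calls in range(10**6):
    if abs(grad(x)) <= eps:                  # epsilon-stationary point reached
        break
    x -= eta * (grad(x) + rng.standard_normal())   # one noisy oracle call
print(f"|grad| <= {eps} at x = {x:.3f} after {calls} oracle calls")
```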

Smoothing the landscape boosts the signal for sgd: Optimal sample complexity for learning single index models

A Damian, E Nichani, R Ge… - Advances in Neural …, 2023 - proceedings.neurips.cc
We focus on the task of learning a single index model $\sigma(w^\star \cdot x)$ with respect
to the isotropic Gaussian distribution in $d$ dimensions. Prior work has shown that the …
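
A minimal sketch of the smoothing idea as I read it: average per-sample gradients over random perturbations of the current iterate on the sphere. The estimator and every constant below are illustrative assumptions, not the paper's tuned scheme:

```python
import numpy as np

# Sketch: smoothed correlation-loss gradient for a single-index model with
# link He_3(t) = t^3 - 3t (a hard, high-information-exponent case).
rng = np.random.default_rng(0)
d, lam = 32, 1.0
sigma_prime = lambda t: 3 * t**2 - 3          # derivative of He_3

def smoothed_grad(w, x, y, n_pert=8):
    """Average -y * sigma'(w'.x) * x over perturbed unit vectors
    w' = normalize(w + (lam/sqrt(d)) * z), then project to the tangent at w."""
    g = np.zeros_like(w)
    for _ in range(n_pert):
        z = rng.standard_normal(d)
        wp = w + lam / np.sqrt(d) * z
        wp /= np.linalg.norm(wp)
        g += -y * sigma_prime(x @ wp) * x / n_pert
    return g - (g @ w) * w

# one online-SGD step with the smoothed gradient
w_star = np.eye(d)[0]
w = rng.standard_normal(d); w /= np.linalg.norm(w)
x = rng.standard_normal(d); y = (x @ w_star) ** 3 - 3 * (x @ w_star)
w -= 0.1 * smoothed_grad(w, x, y); w /= np.linalg.norm(w)
```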

How to escape saddle points efficiently

C Jin, R Ge, P Netrapalli, SM Kakade… - … on machine learning, 2017 - proceedings.mlr.press
This paper shows that a perturbed form of gradient descent converges to a second-order
stationary point in a number of iterations that depends only poly-logarithmically on …
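
A minimal sketch of perturbed gradient descent with illustrative constants, omitting the paper's function-decrease certification step: add a small random perturbation whenever the gradient is small, so strict saddles are escaped:

```python
import numpy as np

def perturbed_gd(grad, x0, eta=0.01, g_thresh=1e-3, radius=1e-2,
                 cooldown=100, iters=20000, rng=np.random.default_rng(0)):
    """Gradient descent that adds a random perturbation whenever the gradient
    is small (a potential saddle), with a cooldown between kicks."""
    x, last_kick = x0.astype(float), -cooldown
    for t in range(iters):
        g = grad(x)
        if np.linalg.norm(g) <= g_thresh and t - last_kick > cooldown:
            x = x + radius * rng.standard_normal(x.shape)   # escape attempt
            last_kick = t
        x = x - eta * g
    return x

# f(v) = (v0^2 - 1)^2 + v1^2 has a strict saddle at the origin, minima at (+-1, 0)
grad = lambda v: np.array([4 * v[0] * (v[0] ** 2 - 1), 2 * v[1]])
print(perturbed_gd(grad, np.zeros(2)))   # starts at the saddle, lands near a minimum
```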