Nonconvex optimization meets low-rank matrix factorization: An overview
Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …
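As a concrete illustration of the factorized approach surveyed above, here is a minimal sketch (not taken from the paper) that runs plain gradient descent on the nonconvex objective $f(U,V) = \tfrac{1}{2}\|UV^\top - M\|_F^2$ for a synthetic rank-$r$ matrix; the dimensions, step size, and initialization scale are arbitrary demo choices.

# Gradient descent on the factorized (Burer-Monteiro-style) objective.
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 50, 40, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))  # ground truth, rank r

U = 0.1 * rng.standard_normal((n, r))   # small random initialization
V = 0.1 * rng.standard_normal((m, r))
eta = 0.005                              # ad hoc step size

for t in range(5000):
    R = U @ V.T - M          # residual
    U -= eta * (R @ V)       # gradient w.r.t. U
    V -= eta * (R.T @ U)     # gradient w.r.t. V

print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))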
Complete dictionary recovery over the sphere I: Overview and the geometric picture
We consider the problem of recovering a complete (i.e., square and invertible) matrix $A_0$ from $Y \in \mathbb{R}^{n \times p}$ with $Y = A_0 X_0$, provided $X_0$ is sufficiently sparse. This recovery problem is …
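To make the generative model in this abstract concrete, the following sketch synthesizes an instance of $Y = A_0 X_0$ with a square $A_0$ and a sparse $X_0$; the Bernoulli-Gaussian sparsity model and the dimensions are assumptions for the demo, not necessarily the paper's exact setup.

# Synthetic instance of the complete dictionary recovery model Y = A0 X0.
import numpy as np

rng = np.random.default_rng(1)
n, p, theta = 20, 500, 0.1           # dimension, samples, sparsity level

A0 = rng.standard_normal((n, n))      # a generic square A0 is invertible a.s.
X0 = rng.standard_normal((n, p)) * (rng.random((n, p)) < theta)  # Bernoulli-Gaussian
Y = A0 @ X0

# Since A0 is invertible, Y and X0 share the same row space, so recovery
# reduces to finding the sparse rows of X0 inside row(Y).
print("rank of A0:", np.linalg.matrix_rank(A0))
print("fraction of nonzeros in X0:", np.mean(X0 != 0))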
Learning single-index models with shallow neural networks
Single-index models are a class of functions given by an unknown univariate "link" function applied to an unknown one-dimensional projection of the input. These models are …
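A minimal sketch of the learning problem, under simplifying assumptions: the link is taken to be known ($\tanh$) and a single neuron is fit by online SGD on the squared loss, whereas the paper studies shallow networks that must also learn the link.

# Online SGD for y = sigma(w* . x) with x ~ N(0, I_d) and known link sigma.
import numpy as np

rng = np.random.default_rng(2)
d = 30
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

sigma = np.tanh
w = rng.standard_normal(d) / np.sqrt(d)   # random initialization
eta = 0.05

for t in range(20000):
    x = rng.standard_normal(d)            # a fresh Gaussian sample each step
    y = sigma(w_star @ x)
    pred = sigma(w @ x)
    # gradient of 0.5*(pred - y)^2 w.r.t. w, using tanh' = 1 - tanh^2
    w -= eta * (pred - y) * (1 - pred**2) * x

print("alignment |<w, w*>| / ||w||:", abs(w @ w_star) / np.linalg.norm(w))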
A geometric analysis of neural collapse with unconstrained features
We provide the first global optimization landscape analysis of Neural Collapse--an intriguing
empirical phenomenon that arises in the last-layer classifiers and features of neural …
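For readers unfamiliar with the geometry, the following worked example constructs the $K$-class simplex equiangular tight frame (ETF) toward which the class means collapse, and checks its two defining properties numerically.

# The K-class simplex ETF: K unit-norm vectors with equal pairwise
# inner products -1/(K-1), the maximally separated configuration.
import numpy as np

K = 5
M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)  # columns = mean directions

G = M.T @ M   # Gram matrix of the K directions
print("diagonal (equal norms):", np.round(np.diag(G), 4))                  # all 1
print("off-diagonal vs -1/(K-1):", np.round(G[0, 1], 4), -1 / (K - 1))    # equal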
Implicit regularization in deep matrix factorization
Efforts to understand the generalization mystery in deep learning have led to the belief that
gradient-based optimization induces a form of implicit regularization, a bias towards models …
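A small sketch of the effect being analyzed, under assumed demo parameters: gradient descent from small initialization on a depth-3 factorization $W_3 W_2 W_1$, fit only to a subset of entries of a low-rank matrix, ends up with low effective rank. The depth, sizes, and step size are arbitrary choices, not the paper's.

# Deep (depth-3) matrix factorization fit to partially observed entries.
import numpy as np

rng = np.random.default_rng(3)
n, r = 20, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-2 target
M /= np.linalg.norm(M, 2)                                       # normalize spectral norm
mask = rng.random((n, n)) < 0.5                                 # observed entries

scale = 0.05                                                    # small initialization
W1, W2, W3 = (scale * rng.standard_normal((n, n)) for _ in range(3))
eta = 0.2

for t in range(5000):
    E = mask * (W3 @ W2 @ W1 - M)     # residual on observed entries only
    g1 = (W3 @ W2).T @ E
    g2 = W3.T @ E @ W1.T
    g3 = E @ (W2 @ W1).T
    W1 -= eta * g1
    W2 -= eta * g2
    W3 -= eta * g3

s = np.linalg.svd(W3 @ W2 @ W1, compute_uv=False)
print("top singular values:", np.round(s[:5], 3))   # expect a sharp drop after index r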
Dying ReLU and initialization: Theory and numerical examples
The dying ReLU refers to the problem when ReLU neurons become inactive and only output
0 for any input. There are many empirical and heuristic explanations of why ReLU neurons …
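The phenomenon is easy to reproduce numerically; here is a minimal sketch with an assumed, pathologically negative bias:

# A "dead" ReLU unit: with a bias far below the pre-activation scale, the
# unit outputs 0 on every input, its (sub)gradient is identically 0, and
# gradient descent can never revive it.
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((1000, 10))           # inputs

w = rng.standard_normal(10)
b = -20.0                                      # pathologically negative bias
pre = X @ w + b                                # pre-activations
out = np.maximum(pre, 0.0)                     # ReLU output

# The subgradient w.r.t. (w, b) is (1[pre > 0] * x, 1[pre > 0]): zero for
# every sample, so any loss gradient through this unit vanishes.
print("max pre-activation:", pre.max())                   # < 0 on all inputs
print("unit output is all zero:", np.all(out == 0.0))
print("fraction of active samples:", np.mean(pre > 0))    # 0.0 -> zero gradient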
On the optimization landscape of neural collapse under MSE loss: Global optimality with unconstrained features
When training deep neural networks for classification tasks, an intriguing empirical
phenomenon has been widely observed in the last-layer classifiers and features, where (i) …
Lower bounds for non-convex stochastic optimization
We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ)
using stochastic first-order methods. In a well-studied model where algorithms access …
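As background for the abstract's terminology, a toy sketch of the oracle model and the ϵ-stationarity goal: the algorithm sees only unbiased noisy gradients with bounded variance and stops once the true gradient norm is at most ϵ. The quadratic test function and plain SGD are illustrative choices, unrelated to the paper's lower-bound construction.

# Stochastic first-order oracle model and epsilon-stationarity check.
import numpy as np

rng = np.random.default_rng(5)
d, noise, eps = 10, 0.5, 0.1

def grad_f(x):                    # true gradient of f(x) = 0.5*||x||^2
    return x

def oracle(x):                    # unbiased stochastic gradient, bounded variance
    return grad_f(x) + noise * rng.standard_normal(d)

x = rng.standard_normal(d) * 5
eta = 0.01
for t in range(1, 200001):
    x -= eta * oracle(x)
    if np.linalg.norm(grad_f(x)) <= eps:     # epsilon-stationary point reached
        print(f"epsilon-stationary after {t} stochastic gradient steps")
        break
print("final ||grad f(x)||:", np.linalg.norm(grad_f(x)))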
Smoothing the landscape boosts the signal for SGD: Optimal sample complexity for learning single index models
We focus on the task of learning a single index model $\sigma(w^\star \cdot x)$ with respect to the isotropic Gaussian distribution in $d$ dimensions. Prior work has shown that the …
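A minimal sketch of the smoothing idea in the title, with an assumed one-dimensional toy loss: Gaussian smoothing $L_\lambda(w) = \mathbb{E}_{z \sim \mathcal{N}(0, I)}\, L(w + \lambda z)$ damps high-frequency oscillations (boosting the usable signal for gradient methods) while preserving the low-frequency structure.

# Monte Carlo estimate of a Gaussian-smoothed loss on a toy landscape.
import numpy as np

rng = np.random.default_rng(6)

def L(w):
    # toy nonconvex loss: a broad well plus small high-frequency oscillations
    return -np.exp(-w**2) + 0.1 * np.cos(20 * w)

def L_smoothed(w, lam, n_mc=5000):
    z = rng.standard_normal(n_mc)
    return np.mean(L(w + lam * z))

for w in (0.0, 0.5, 1.0):
    print(w, round(L(w), 4), round(L_smoothed(w, lam=0.3), 4))
# Smoothing averages out the cos term (its smoothed amplitude decays like
# exp(-(20*lam)^2 / 2)) while the broad well survives almost unchanged.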
How to escape saddle points efficiently
This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations that depends only poly-logarithmically on …
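A minimal sketch of the perturbation idea on a toy strict-saddle function: run gradient descent, and when the gradient is small (a candidate saddle), inject noise from a small ball. The thresholds and radius here are ad hoc, whereas the paper chooses them carefully to obtain the poly-logarithmic dimension dependence.

# Perturbed gradient descent escaping a strict saddle.
import numpy as np

rng = np.random.default_rng(7)

def grad(p):
    x, y = p
    # f(x, y) = x^4/4 - x^2/2 + y^2/2: strict saddle at (0, 0), minima at (+-1, 0)
    return np.array([x**3 - x, y])

p = np.array([0.0, 1.0])       # plain GD from here converges to the saddle (0, 0)
eta, g_thres, radius = 0.1, 1e-3, 1e-2

for t in range(2000):
    g = grad(p)
    if np.linalg.norm(g) <= g_thres:             # stuck near a stationary point
        u = rng.standard_normal(2)
        p = p + radius * u / np.linalg.norm(u)   # perturb within a small ball
    else:
        p = p - eta * g

print("converged to approximately:", np.round(p, 3))   # (+-1, 0), a local minimum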