Overview frequency principle/spectral bias in deep learning

ZQJ Xu, Y Zhang, T Luo - Communications on Applied Mathematics and …, 2024 - Springer
Understanding deep learning is increasingly important as it penetrates more and more of
industry and science. In recent years, a line of research based on Fourier analysis has shed light on …
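
As a quick illustration of the phenomenon this survey covers (a minimal numpy sketch, not code from the paper; the network size, frequencies, and training settings below are arbitrary choices): fit a superposition of a low-frequency and a high-frequency sine with a small two-layer tanh network and track the Fourier amplitude of the residual at each frequency.

    import numpy as np

    # Illustrative target: a low-frequency (k=1) plus a high-frequency (k=5) sine.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 256, endpoint=False).reshape(-1, 1)
    y = np.sin(2 * np.pi * x) + 0.5 * np.sin(2 * np.pi * 5 * x)

    # Small two-layer tanh network trained by full-batch gradient descent on the MSE.
    m = 200
    W1 = rng.normal(0.0, 10.0, (1, m)); b1 = rng.uniform(-10.0, 0.0, m)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(m), (m, 1)); b2 = np.zeros(1)
    lr, n = 5e-3, len(x)

    def residual_amp(err, k):
        # Amplitude of frequency k in the residual on the uniform grid over [0, 1).
        return np.abs(np.exp(-2j * np.pi * k * x[:, 0]) @ err[:, 0]) / n

    for step in range(10001):
        h = np.tanh(x @ W1 + b1)
        err = h @ W2 + b2 - y
        gW2 = h.T @ err / n                        # backprop for the mean-squared-error loss
        gh = (err @ W2.T) * (1.0 - h ** 2)
        gW1 = x.T @ gh / n
        W2 -= lr * gW2; b2 -= lr * err.mean(0)
        W1 -= lr * gW1; b1 -= lr * gh.mean(0)
        if step % 2000 == 0:
            print(step, residual_amp(err, 1), residual_amp(err, 5))
    # The k=1 component of the error typically decays well before the k=5 component,
    # which is the frequency-principle / spectral-bias behavior the survey discusses.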

On lazy training in differentiable programming

L Chizat, E Oyallon, F Bach - Advances in neural …, 2019 - proceedings.neurips.cc
In a series of recent theoretical works, it was shown that strongly over-parameterized neural
networks trained with gradient-based methods could converge exponentially fast to zero …
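
A minimal numpy sketch of the lazy-training scaling discussed there (our own toy setup, not the paper's): scale the centered model output by a factor alpha and run gradient descent on the alpha^{-2}-rescaled squared loss. As alpha grows, the fit stays comparable while the weights move less and less from their initialization, which is the "lazy" regime in which the dynamics stay close to the linearized model.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 5))
    y = np.sin(X @ rng.normal(size=5)).reshape(-1, 1)

    def init(m=128, seed=1):
        r = np.random.default_rng(seed)
        return r.normal(size=(5, m)) / np.sqrt(5), r.normal(size=(m, 1)) / np.sqrt(m)

    for alpha in (1.0, 10.0, 100.0):
        W1, W2 = init()
        W1_0, W2_0 = W1.copy(), W2.copy()
        f0 = np.tanh(X @ W1_0) @ W2_0                # subtract so the scaled model starts at 0
        lr = 0.2
        for _ in range(3000):
            H = np.tanh(X @ W1)
            out = alpha * (H @ W2 - f0)
            # Gradient of the rescaled objective mean((out - y)**2) / (2 * alpha**2).
            e = (out - y) / (len(X) * alpha)
            gW2 = H.T @ e
            gW1 = X.T @ ((e @ W2.T) * (1.0 - H ** 2))
            W1 -= lr * gW1; W2 -= lr * gW2
        move = np.linalg.norm(W1 - W1_0) / np.linalg.norm(W1_0)
        fit = np.mean((alpha * (np.tanh(X @ W1) @ W2 - f0) - y) ** 2)
        print(f"alpha={alpha:6.1f}  train MSE={fit:.4f}  relative weight movement={move:.4f}")
    # Larger alpha: a comparable fit, but the parameters barely move (lazy/linearized regime).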

The generalization error of random features regression: Precise asymptotics and the double descent curve

S Mei, A Montanari - Communications on Pure and Applied …, 2022 - Wiley Online Library
Deep learning methods operate in regimes that defy the traditional statistical mindset.
Neural network architectures often contain more parameters than training samples, and are …
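
The double descent curve they analyze can be reproduced qualitatively in a few lines (a rough finite-size sketch with random ReLU features and min-norm least squares; the dimensions and sample sizes are arbitrary choices, not the paper's precise asymptotic setting):

    import numpy as np

    rng = np.random.default_rng(0)
    d, n, n_test = 20, 200, 2000
    beta = rng.normal(size=d) / np.sqrt(d)

    def sample(size):
        X = rng.normal(size=(size, d))
        return X, X @ beta + 0.1 * rng.normal(size=size)

    Xtr, ytr = sample(n)
    Xte, yte = sample(n_test)

    for N in (20, 50, 100, 190, 200, 210, 400, 1000, 4000):
        W = rng.normal(size=(d, N)) / np.sqrt(d)        # fixed random first-layer weights
        Ftr, Fte = np.maximum(Xtr @ W, 0), np.maximum(Xte @ W, 0)
        a, *_ = np.linalg.lstsq(Ftr, ytr, rcond=None)   # min-norm least squares on the features
        print(f"N={N:5d}  test MSE={np.mean((Fte @ a - yte) ** 2):.3f}")
    # The test error typically peaks near the interpolation threshold N = n
    # and decreases again as N grows past it (the double descent shape).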

Implicit bias of gradient descent for wide two-layer neural networks trained with the logistic loss

L Chizat, F Bach - Conference on learning theory, 2020 - proceedings.mlr.press
Neural networks trained to minimize the logistic (aka cross-entropy) loss with gradient-based
methods are observed to perform well in many supervised classification tasks. Towards …
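
For orientation, a hedged summary of the kind of characterization this line of work gives, stated from memory rather than from the snippet: for suitably scaled, infinitely wide two-layer networks, the direction selected by gradient flow on the logistic loss is described as a max-margin classifier over a variation-norm ball, roughly

    \max_{\|f\|_{\mathcal{F}_1} \le 1} \; \min_{1 \le i \le n} \; y_i f(x_i),

where \|\cdot\|_{\mathcal{F}_1} denotes the variation norm associated with the hidden-unit features.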

The merged-staircase property: a necessary and nearly sufficient condition for SGD learning of sparse functions on two-layer neural networks

E Abbe, EB Adsera… - Conference on Learning …, 2022 - proceedings.mlr.press
It is currently known how to characterize functions that neural networks can learn with SGD
for two extremal parametrizations: neural networks in the linear regime, and neural networks …
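
A small illustrative checker for the property in the title, assuming the informal definition used in this line of work (a monomial support has the merged-staircase property if the monomials can be ordered so that each one introduces at most one previously unseen coordinate); the code is ours, not the paper's:

    def has_msp(monomials):
        # Greedy check: repeatedly pick any monomial that adds at most one new
        # coordinate to the set covered so far; succeed iff all get picked.
        remaining = [frozenset(m) for m in monomials]
        covered = set()
        while remaining:
            for m in remaining:
                if len(m - covered) <= 1:
                    covered |= m
                    remaining.remove(m)
                    break
            else:
                return False
        return True

    print(has_msp([{1}, {1, 2}, {1, 2, 3}]))   # True: the staircase x1 + x1*x2 + x1*x2*x3
    print(has_msp([{1, 2}]))                   # False: x1*x2 introduces two new coordinates at once
    print(has_msp([{1}, {2, 3}]))              # False
    print(has_msp([{1}, {2}, {1, 2, 3}]))      # True: after x1 and x2, the cubic adds only x3

Greedy selection suffices here because adding a feasible monomial only enlarges the covered coordinate set, so it can never rule out a later choice.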

Landscape and training regimes in deep learning

M Geiger, L Petrini, M Wyart - Physics Reports, 2021 - Elsevier
Deep learning algorithms are responsible for a technological revolution in a variety of tasks,
including image recognition and Go playing. Yet why they work is not understood. Ultimately …

Toward moderate overparameterization: Global convergence guarantees for training shallow neural networks

S Oymak, M Soltanolkotabi - IEEE Journal on Selected Areas in …, 2020 - ieeexplore.ieee.org
Many modern neural network architectures are trained in an overparameterized regime
where the parameters of the model exceed the size of the training dataset. Sufficiently …
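
As a quick, hedged illustration of the regime described here (not the paper's analysis, which concerns gradient descent on the nonconvex training problem; this only shows that interpolation becomes possible once the trainable parameters outnumber the samples): fit arbitrary labels with the top layer of a random shallow ReLU network and watch the training residual drop to zero once the width exceeds the sample count.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 20, 100
    X = rng.normal(size=(n, d)) / np.sqrt(d)
    y = rng.choice([-1.0, 1.0], size=n)            # arbitrary (random) labels

    for m in (10, 50, 100, 200, 800):
        W1 = rng.normal(size=(d, m))               # random hidden layer, held fixed here
        F = np.maximum(X @ W1, 0)                  # n x m ReLU feature matrix
        a, *_ = np.linalg.lstsq(F, y, rcond=None)  # least-squares fit of the top layer
        res = np.linalg.norm(F @ a - y) / np.linalg.norm(y)
        print(f"width m={m:4d}  relative training residual={res:.2e}")
    # Once m comfortably exceeds n the residual is numerically zero: the model can
    # interpolate the data, which is the overparameterized setting the guarantees address.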

Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

C Fang, H He, Q Long, WJ Su - Proceedings of the National …, 2021 - National Acad Sciences
In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable,
optimization program, in a quest to better understand deep neural networks that are trained …
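
For reference, the layer-peeled program has the rough form below (written from memory and up to normalization conventions, so treat the exact constants as approximate): the last-layer features and the classifier are optimized directly under norm budgets, with the deeper layers "peeled off",

    \min_{\mathbf{W},\,\mathbf{H}} \ \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{n_k}
        \mathcal{L}\!\left(\mathbf{W}\mathbf{h}_{k,i},\, \mathbf{e}_k\right)
    \quad \text{s.t.} \quad
    \frac{1}{K} \sum_{k=1}^{K} \|\mathbf{w}_k\|_2^2 \le E_W, \qquad
    \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{n_k} \|\mathbf{h}_{k,i}\|_2^2 \le E_H,

where \mathcal{L} is the cross-entropy loss, n_k is the number of training samples in class k, and E_W, E_H are the norm budgets; the "minority collapse" of the title concerns what happens to the minority classes' classifier vectors when the n_k are strongly imbalanced.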

Linearized two-layers neural networks in high dimension

B Ghorbani, S Mei, T Misiakiewicz, A Montanari - 2021 - projecteuclid.org
The Supplementary Material contains the proofs of Theorem 1 (a) in Appendix A, Theorem 1
(b) in Appendix B, Proposition 2 in Appendix C, Theorem 2 (b) in Appendix D and Theorem …

High-dimensional limit theorems for SGD: Effective dynamics and critical scaling

G Ben Arous, R Gheissari… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study the scaling limits of stochastic gradient descent (SGD) with constant step-size in
the high-dimensional regime. We prove limit theorems for the trajectories of summary …
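
A minimal numpy sketch of the kind of experiment behind such results (our own toy model, a noiseless linear target with step size c/d; this is not the paper's setting): run online SGD in several dimensions and record a summary statistic, the overlap with the target direction, at fixed rescaled times t = (number of samples)/d. As d grows, the trajectories concentrate around a deterministic effective dynamics.

    import numpy as np

    c, T = 1.0, 4.0
    for d in (100, 400, 1600):
        rng = np.random.default_rng(0)
        theta_star = np.zeros(d); theta_star[0] = 1.0
        theta = rng.normal(size=d) / np.sqrt(d)          # near-zero initial overlap
        eta = c / d                                      # constant step size, scaled with dimension
        checkpoints = {int(t * d): t for t in (0.5, 1.0, 2.0, 4.0)}
        out = []
        for k in range(1, int(T * d) + 1):
            xk = rng.normal(size=d)                      # fresh sample each step (online SGD)
            theta += eta * (xk @ theta_star - xk @ theta) * xk
            if k in checkpoints:
                out.append((checkpoints[k], theta @ theta_star))
        print(f"d={d:5d}  " + "  ".join(f"m({t})={m:.3f}" for t, m in out))
    # At fixed rescaled time t the overlap concentrates, as d grows, around the
    # deterministic curve 1 - exp(-c t) for this toy model (the "effective dynamics").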