Overview frequency principle/spectral bias in deep learning

ZQJ Xu, Y Zhang, T Luo - Communications on Applied Mathematics and …, 2024 - Springer
Understanding deep learning is increasingly urgent as it penetrates more deeply into
industry and science. In recent years, a line of research based on Fourier analysis has shed light on …
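
The frequency principle (spectral bias) surveyed here is the observation that, during training, deep networks tend to fit low-frequency components of a target before high-frequency ones. A minimal sketch of how one might observe this on a 1D regression task (architecture, frequencies, and hyperparameters are illustrative choices, not taken from the survey):

```python
# Minimal sketch (not from the survey): observing the frequency principle on a
# 1D regression task by tracking the residual's Fourier spectrum.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-1, 1, 256).unsqueeze(1)
# Target mixes a low-frequency (1 cycle) and a high-frequency (10 cycles) component.
y = torch.sin(np.pi * x) + 0.5 * torch.sin(10 * np.pi * x)

model = nn.Sequential(nn.Linear(1, 256), nn.Tanh(), nn.Linear(256, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def freq_error(residual, k):
    # Magnitude of frequency bin k of the residual (k cycles over the window).
    return np.abs(np.fft.rfft(residual.numpy().ravel()))[k]

for step in range(5001):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            r = model(x) - y
        # The low-frequency residual typically shrinks well before the high-frequency one.
        print(f"step {step:5d}  err@bin1 {freq_error(r, 1):8.3f}  err@bin10 {freq_error(r, 10):8.3f}")
```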

The geometry of feature space in deep learning models: a holistic perspective and comprehensive review

M Lee - Mathematics, 2023 - mdpi.com
As the field of deep learning experiences a meteoric rise, the urgency to decipher the
complex geometric properties of feature spaces, which underlie the effectiveness of diverse …

Empirical phase diagram for three-layer neural networks with infinite width

H Zhou, Q Zhou, Z **, T Luo… - Advances in Neural …, 2022 - proceedings.neurips.cc
Substantial work indicates that the dynamics of neural networks (NNs) are closely related to
the initialization of their parameters. Inspired by the phase diagram for two-layer ReLU NNs with …
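
For context, such phase diagrams are drawn over scaled parameterizations of the network. In the two-layer case the abstract alludes to, a common form (schematic; the paper's exact normalization may differ) is

$$f_\theta(x) = \frac{1}{\alpha}\sum_{k=1}^{m} a_k\,\sigma(w_k^\top x),\qquad a_k \sim \mathcal{N}(0,\beta_1^2),\quad w_k \sim \mathcal{N}(0,\beta_2^2 I),$$

with distinct training regimes (e.g., linear, critical, condensed) identified by how $\alpha$, $\beta_1$, and $\beta_2$ scale with the width $m$.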

Embedding principle: a hierarchical structure of loss landscape of deep neural networks

Y Zhang, Y Li, Z Zhang, T Luo, ZQJ Xu - arXiv preprint arXiv:2111.15527, 2021 - arxiv.org
We prove a general Embedding Principle for the loss landscape of deep neural networks (NNs)
that unravels a hierarchical structure of the loss landscape, i.e., the loss landscape of an …
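
A minimal illustration of the kind of embedding involved, in the two-layer case (this sketch assumes the one-step neuron-splitting construction; the general statement is in the paper): for $f(x)=\sum_{k=1}^{m} a_k\,\sigma(w_k^\top x)$, splitting neuron $j$ into two neurons that share the input weight $w_j$ and carry output weights $\lambda a_j$ and $(1-\lambda)a_j$ leaves the output function unchanged,

$$\lambda a_j\,\sigma(w_j^\top x) + (1-\lambda)\,a_j\,\sigma(w_j^\top x) = a_j\,\sigma(w_j^\top x),$$

and maps critical points of the narrower network to critical points of the wider one, giving the wider landscape a hierarchy of embedded critical sets.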

Mathematical introduction to deep learning: methods, implementations, and theory

A Jentzen, B Kuckuck, P von Wurstemberger - arXiv preprint arXiv …, 2023 - arxiv.org
This book aims to provide an introduction to deep learning algorithms. We review the
essential components of deep learning algorithms in full mathematical detail, including …

Implicit regularization of dropout

Z Zhang, ZQJ Xu - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org
It is important to understand how dropout, a popular regularization method, helps neural
network training reach solutions that generalize well. In this work, we present a …
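
As a reminder of the mechanism being analyzed, here is a minimal sketch of inverted dropout; the implicit regularization term the paper derives is not reproduced here, and the function name and drop rate below are illustrative:

```python
# Minimal sketch of inverted dropout, the stochastic masking whose implicit
# bias the paper analyzes.
import torch

def dropout(x: torch.Tensor, p: float, training: bool = True) -> torch.Tensor:
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) > p).float()
    # Rescale by 1/(1 - p) so the expected output matches the mask-free forward pass.
    return x * mask / (1.0 - p)

h = torch.randn(4, 8)
print(dropout(h, p=0.5))                  # training: random units zeroed, survivors rescaled
print(dropout(h, p=0.5, training=False))  # evaluation: identity
```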

Loss spike in training neural networks

X Li, ZQJ Xu, Z Zhang - arXiv preprint arXiv:2305.12133, 2023 - arxiv.org
In this work, we investigate the mechanism underlying the loss spikes observed during neural
network training. When training enters a region with a lower-loss-as-sharper (LLAS) …

On the existence of global minima and convergence analyses for gradient descent methods in the training of deep neural networks

A Jentzen, A Riekert - arXiv preprint arXiv:2112.09684, 2021 - arxiv.org
In this article, we study fully connected feedforward deep ReLU ANNs with an arbitrarily
large number of hidden layers, and we prove convergence of the risk of the GD optimization …
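
Schematically (our notation, not necessarily the article's exact setup), the object of study is the empirical risk of the network $\mathcal{N}_\theta$ under plain gradient descent:

$$\mathcal{R}(\theta) = \frac{1}{n}\sum_{i=1}^{n}\big\|\mathcal{N}_\theta(x_i) - y_i\big\|^2,\qquad \theta_{t+1} = \theta_t - \gamma\,\nabla_\theta\mathcal{R}(\theta_t),$$

with the article establishing, under suitable assumptions, convergence of the risk along the GD iterates.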

Towards understanding the condensation of neural networks at initial training

H Zhou, Q Zhou, T Luo, Y Zhang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Empirical works show that for ReLU neural networks (NNs) with small initialization, input
weights of hidden neurons (the input weight of a hidden neuron consists of the weight from …
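
Condensation refers to many hidden neurons' input weight vectors aligning with a few common directions early in training. A hedged sketch of how one might detect it in a two-layer ReLU network with small initialization (target function, width, learning rate, and the 0.95 threshold are illustrative, not from the paper):

```python
# Hedged sketch: detecting condensation of input weights in a two-layer ReLU
# network trained from a small initialization.
import torch

torch.manual_seed(0)
x = torch.rand(64, 2) * 2 - 1
y = torch.sin(3 * x.sum(dim=1, keepdim=True))

m = 50                                             # hidden width
W = (torch.randn(m, 2) * 1e-3).requires_grad_()    # input weights, small init scale
a = (torch.randn(m, 1) * 1e-3).requires_grad_()    # output weights
opt = torch.optim.Adam([W, a], lr=1e-3)

for step in range(3000):
    opt.zero_grad()
    pred = torch.relu(x @ W.t()) @ a
    ((pred - y) ** 2).mean().backward()
    opt.step()

with torch.no_grad():
    Wn = W / W.norm(dim=1, keepdim=True)           # normalize each input weight vector
    cos = Wn @ Wn.t()                              # pairwise cosine similarities
    # Under condensation, most similarities cluster near +/-1, i.e., many
    # neurons share a few weight directions.
    print("fraction near-aligned:", (cos.abs() > 0.95).float().mean().item())
```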

Phase diagram of initial condensation for two-layer neural networks

Z Chen, Y Li, T Luo, Z Zhou, ZQJ Xu - arXiv preprint arXiv:2303.06561, 2023 - arxiv.org
The phenomenon of distinct behaviors exhibited by neural networks under varying scales of
initialization remains an enigma in deep learning research. In this paper, based on the …