Deep Learning and Geometric Deep Learning: An introduction for mathematicians and physicists

R Fioresi, F Zanchetta - … Journal of Geometric Methods in Modern …, 2023 - World Scientific
In this expository paper, we want to give a brief introduction, with a few key references for
further reading, to the inner functioning of the new and successful algorithms of Deep …

Toward large kernel models

A Abedsoltan, M Belkin… - … Conference on Machine …, 2023 - proceedings.mlr.press
Recent studies indicate that kernel machines can often perform similarly or better than deep
neural networks (DNNs) on small datasets. The interest in kernel machines has been …

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

L Zhu, C Liu, A Radhakrishnan, M Belkin - arXiv preprint arXiv:2306.04815, 2023 - arxiv.org
In this paper, we first present an explanation regarding the common occurrence of spikes in
the training loss when neural networks are trained with stochastic gradient descent (SGD) …

On emergence of clean-priority learning in early stopped neural networks

C Liu, A Abedsoltan, M Belkin - arXiv preprint arXiv:2306.02533, 2023 - arxiv.org
When random label noise is added to a training dataset, the prediction error of a neural
network on a label-noise-free test dataset initially improves during early training but …

Toward Understanding the Dynamics of Over-parameterized Neural Networks

L Zhu - 2024 - search.proquest.com
The practical applications of neural networks are vast and varied, yet a comprehensive
understanding of their underlying principles remains incomplete. This dissertation advances …

Mechanism of clean-priority learning in early stopped neural networks of infinite width

C Liu, A Abedsoltan, M Belkin - openreview.net
When random label noise is added to a training dataset, the prediction error of a neural
network on a label-noise-free test dataset initially improves during early training but …