A high-bias, low-variance introduction to machine learning for physicists

P Mehta, M Bukov, CH Wang, AGR Day, C Richardson… - Physics Reports, 2019 - Elsevier
Machine Learning (ML) is one of the most exciting and dynamic areas of modern
research and application. The purpose of this review is to provide an introduction to the core …

Statistical mechanics of deep learning

Y Bahri, J Kadmon, J Pennington… - Annual Review of …, 2020 - annualreviews.org
The recent striking success of deep neural networks in machine learning raises profound
questions about the theoretical principles underlying their success. For example, what can …

Optimization for deep learning: theory and algorithms

R Sun - arXiv preprint arXiv:1912.08957, 2019 - arxiv.org
When and why can a neural network be successfully trained? This article provides an
overview of optimization algorithms and theory for training neural networks. First, we discuss …
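
To make the topic concrete, here is a minimal sketch of mini-batch SGD with heavy-ball momentum, one of the basic methods such overviews cover, on a toy least-squares problem. The data, learning rate, and momentum coefficient are illustrative assumptions, not taken from the article.

    import numpy as np

    # Toy least-squares problem: recover x_true from noiseless
    # measurements b = A @ x_true.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 10))
    x_true = rng.normal(size=10)
    b = A @ x_true

    x = np.zeros(10)
    v = np.zeros(10)
    lr, beta, batch = 0.01, 0.9, 10
    for step in range(500):
        idx = rng.integers(0, 100, size=batch)           # mini-batch sample
        grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch  # stochastic gradient
        v = beta * v + grad                              # momentum buffer
        x -= lr * v                                      # parameter update
    print("final loss:", 0.5 * np.mean((A @ x - b) ** 2))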

Quantitative theory of magnetic interactions in solids

A Szilva, Y Kvashnin, EA Stepanov, L Nordström… - Reviews of Modern …, 2023 - APS
This review addresses the method of explicit calculation of interatomic exchange
interactions in magnetic materials. This involves exchange mechanisms normally referred to …
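
The interatomic exchange interactions in question are the couplings J_ij of a Heisenberg-type spin Hamiltonian. One common form (the sign convention and the counting of pairs vary between groups, so this is an assumed convention rather than the review's specific choice) is

    H = -\sum_{i \neq j} J_{ij} \, \mathbf{S}_i \cdot \mathbf{S}_j ,

where J_{ij} > 0 favors parallel (ferromagnetic) alignment of the spins S_i and S_j.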

Extreme value statistics of correlated random variables: a pedagogical review

SN Majumdar, A Pal, G Schehr - Physics Reports, 2020 - Elsevier
Extreme value statistics (EVS) concerns the study of the statistics of the maximum or the
minimum of a set of random variables. This is an important problem for any time series and …
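
The i.i.d. benchmark behind this program is easy to check numerically: the maximum of n exponential variables, centered by log n, approaches the Gumbel law, with mean the Euler-Mascheroni constant and variance pi^2/6. A minimal sketch (the sample sizes are illustrative assumptions; the correlated case the authors review is harder):

    import numpy as np

    # Maximum of n i.i.d. Exp(1) variables, shifted by log n, follows
    # the Gumbel law P(z) = exp(-z - e^{-z}) as n grows.
    rng = np.random.default_rng(0)
    n, trials = 1_000, 10_000
    maxima = rng.exponential(size=(trials, n)).max(axis=1)
    z = maxima - np.log(n)                      # classical centering

    print("empirical mean:", round(z.mean(), 3),
          "| Gumbel mean (Euler-Mascheroni):", 0.577)
    print("empirical var: ", round(z.var(), 3),
          "| Gumbel var (pi^2/6):", round(np.pi**2 / 6, 3))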

Scaling description of generalization with number of parameters in deep learning

M Geiger, A Jacot, S Spigler, F Gabriel… - Journal of Statistical …, 2020 - iopscience.iop.org
Supervised deep learning involves the training of neural networks with a large number N of
parameters. For large enough N, in the so-called over-parametrized regime, one can …
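
The over-parametrized regime can be illustrated with a random-features linear model standing in for a wide network; the target, sizes, and tanh features below are an assumed toy setup, not the paper's. The minimum-norm fit returned by lstsq interpolates the training data once the number of parameters N exceeds the number of training points P:

    import numpy as np

    # Track test error as the parameter count N crosses the number of
    # training points P; lstsq returns the minimum-norm interpolating
    # solution once N > P.
    rng = np.random.default_rng(0)
    d, P, P_test = 30, 100, 2000
    w_star = rng.normal(size=d)
    X, X_te = rng.normal(size=(P, d)), rng.normal(size=(P_test, d))
    y, y_te = X @ w_star, X_te @ w_star

    for N in (20, 50, 90, 100, 110, 200, 1000):
        V = rng.normal(size=(d, N)) / np.sqrt(d)        # random feature map
        F, F_te = np.tanh(X @ V), np.tanh(X_te @ V)     # nonlinear features
        a, *_ = np.linalg.lstsq(F, y, rcond=None)       # min-norm fit
        print(f"N={N:5d}  test MSE={np.mean((F_te @ a - y_te)**2):.3f}")

The qualitative shape this literature analyzes is a spike in test error near N = P followed by improvement as N grows further; exact values depend on the toy setup.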

A tail-index analysis of stochastic gradient noise in deep neural networks

U Simsekli, L Sagun… - … Conference on Machine …, 2019 - proceedings.mlr.press
The gradient noise (GN) in the stochastic gradient descent (SGD) algorithm is often
considered to be Gaussian in the large data regime by assuming that the classical central …
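
A standard diagnostic for this Gaussianity question is to estimate the tail index of the gradient-noise magnitudes: power-law tails give a finite index, Gaussian tails do not. Below is a minimal Hill-estimator sketch on synthetic Pareto samples standing in for noise norms; the estimator choice and the data are illustrative assumptions, not the paper's setup.

    import numpy as np

    # Hill estimate of the tail index alpha from the k largest samples.
    rng = np.random.default_rng(0)
    alpha_true = 1.5
    samples = rng.pareto(alpha_true, size=100_000) + 1.0  # Pareto(alpha), x_m = 1

    x = np.sort(samples)[::-1]          # descending order statistics
    k = 1000                            # number of tail samples used
    hill = 1.0 / np.mean(np.log(x[:k]) - np.log(x[k]))
    print(f"true tail index: {alpha_true}, Hill estimate: {hill:.2f}")
    # A Gaussian has no power-law tail; its Hill estimate keeps growing
    # with k instead of stabilizing near a finite alpha.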

Optimization for deep learning: An overview

RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research for several reasons. First, its …

Implicit self-regularization in deep neural networks: Evidence from random matrix theory and implications for learning

CH Martin, MW Mahoney - Journal of Machine Learning Research, 2021 - jmlr.org
Random Matrix Theory (RMT) is applied to analyze the weight matrices of Deep Neural
Networks (DNNs), including both production-quality, pre-trained models such as AlexNet …
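
The basic diagnostic behind this analysis compares the empirical spectral density of a layer's correlation matrix W^T W / n with the Marchenko-Pastur bulk expected for pure noise; eigenvalues escaping the bulk edge and heavier-than-MP tails are the signatures of interest. A minimal sketch, with a random matrix standing in for a trained layer:

    import numpy as np

    # Eigenvalues of W^T W / n for an n x m Gaussian W should fill the
    # Marchenko-Pastur bulk [(1 - sqrt(q))^2, (1 + sqrt(q))^2], q = m/n.
    rng = np.random.default_rng(0)
    n, m = 1000, 500                        # layer dimensions (assumed)
    W = rng.normal(size=(n, m))
    eigs = np.linalg.eigvalsh(W.T @ W / n)

    q = m / n                               # aspect ratio
    lam_minus, lam_plus = (1 - np.sqrt(q))**2, (1 + np.sqrt(q))**2
    print(f"MP bulk edge prediction: [{lam_minus:.3f}, {lam_plus:.3f}]")
    print(f"observed eigenvalue range: [{eigs.min():.3f}, {eigs.max():.3f}]")
    # For trained layers, the paper tracks how the observed spectrum
    # departs from this pure-noise bulk.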

Disentangling feature and lazy training in deep neural networks

M Geiger, S Spigler, A Jacot… - Journal of Statistical …, 2020 - iopscience.iop.org
Two distinct limits for deep learning have been derived as the network width h → ∞,
depending on how the weights of the last layer scale with h. In the neural tangent kernel …
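
The two limits can be seen numerically by tracking how far the hidden weights move during training under the two last-layer scalings, 1/sqrt(h) (lazy, NTK-like) versus 1/h (feature learning, mean-field-like). The toy task, learning rates, and step counts below are illustrative assumptions.

    import numpy as np

    # One-hidden-layer tanh network f(x) = scale * sum_i a_i tanh(w_i x)
    # trained by full-batch gradient descent on a toy 1-d regression.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 1))
    y = np.sin(2 * X[:, 0])

    def train(h, scale, lr, steps=1000):
        w = rng.normal(size=(1, h))                 # hidden weights
        a = rng.normal(size=h)                      # output weights
        w0 = w.copy()
        for _ in range(steps):
            z = np.tanh(X @ w)                      # (50, h) activations
            r = scale * z @ a - y                   # residuals
            a -= lr * scale * z.T @ r / len(y)
            w -= lr * scale * X.T @ ((r[:, None] * a) * (1 - z**2)) / len(y)
        return np.linalg.norm(w - w0) / np.linalg.norm(w0)

    # Lazy regime: relative motion of hidden weights shrinks as h grows.
    # 1/h scaling (with lr ~ h, the mean-field time scale): it stays O(1).
    for h in (100, 1000, 5000):
        print(f"h={h:5d}",
              f"lazy={train(h, 1/np.sqrt(h), lr=1.0):.4f}",
              f"feature={train(h, 1/h, lr=float(h)):.4f}")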