Machine learning and the physical sciences

G Carleo, I Cirac, K Cranmer, L Daudet, M Schuld… - Reviews of Modern …, 2019 - APS
Machine learning (ML) encompasses a broad range of algorithms and modeling tools used
for a vast array of data processing tasks, which has entered most scientific disciplines in …

A transdisciplinary review of deep learning research and its relevance for water resources scientists

C Shen - Water Resources Research, 2018 - Wiley Online Library
Deep learning (DL), a new generation of artificial neural network research, has transformed
industries, daily lives, and various scientific disciplines in recent years. DL represents …

Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data

GK Dziugaite, DM Roy - arXiv preprint arXiv:1703.11008, 2017 - arxiv.org
One of the defining properties of deep learning is that models are chosen to have many
more parameters than available training data. In light of this capacity for overfitting, it is …
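
The computation behind such bounds can be illustrated in a few lines. The sketch below inverts the PAC-Bayes-kl inequality numerically: given the empirical error of a stochastic network, the KL divergence between posterior and prior, and the training-set size, it returns an upper bound on the true error that holds with high probability. The function names and the numbers in the example call are illustrative, not taken from the paper's experiments.

```python
import numpy as np

def kl_bernoulli(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    q, p = np.clip(q, eps, 1 - eps), np.clip(p, eps, 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def pac_bayes_kl_bound(emp_error, kl_qp, m, delta=0.05):
    """Largest p with kl(emp_error || p) <= (kl_qp + ln(2 sqrt(m) / delta)) / m,
    found by bisection; this is the PAC-Bayes-kl upper bound on the true error."""
    rhs = (kl_qp + np.log(2 * np.sqrt(m) / delta)) / m
    lo, hi = emp_error, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if kl_bernoulli(emp_error, mid) > rhs:
            hi = mid
        else:
            lo = mid
    return hi

# Made-up numbers: 3% stochastic training error, KL(Q||P) = 5000 nats,
# 55,000 training examples, confidence 0.95.
print(pac_bayes_kl_bound(0.03, 5000.0, 55000))
```

With a KL term small enough relative to the number of training points, the returned bound stays well below 1, which is what "nonvacuous" means in this context.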

Entropy-SGD: Biasing gradient descent into wide valleys

P Chaudhari, A Choromanska, S Soatto… - Journal of Statistical …, 2019 - iopscience.iop.org
This paper proposes a new optimization algorithm called Entropy-SGD for training deep
neural networks that is motivated by the local geometry of the energy landscape. Local …
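
A minimal sketch of the two-loop structure Entropy-SGD uses, written here for a one-dimensional toy loss in plain NumPy: an inner Langevin (SGLD) loop estimates the mean of a local Gibbs distribution centred at the current weights, and the outer step follows the resulting local-entropy gradient, which biases the iterate toward wide valleys. The toy loss, step sizes, loop lengths, and the scope parameter gamma are placeholders, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_loss_grad(w):
    """Gradient of a toy non-convex loss f(w) = w^4 - 4 w^2 + 0.5 w,
    standing in for a minibatch gradient of a deep network."""
    return 4 * w**3 - 8 * w + 0.5

def entropy_sgd_step(w, gamma=1.0, eta_outer=0.1, eta_inner=0.01,
                     eps=1e-3, L=20, alpha=0.75):
    """One outer Entropy-SGD step: the inner SGLD loop estimates the mean mu of
    the local Gibbs measure ~ exp(-f(w') - gamma/2 |w - w'|^2); the outer update
    follows the local-entropy gradient gamma * (w - mu)."""
    w_prime = w.copy()
    mu = w.copy()
    for _ in range(L):
        g = toy_loss_grad(w_prime) - gamma * (w - w_prime)
        w_prime = (w_prime - eta_inner * g
                   + np.sqrt(eta_inner) * eps * rng.standard_normal(w.shape))
        mu = (1 - alpha) * mu + alpha * w_prime
    return w - eta_outer * gamma * (w - mu)

w = np.array([2.0])
for _ in range(300):
    w = entropy_sgd_step(w)
print("final w:", w)   # settles near a minimum of the toy loss (about +/- 1.4)
```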

Statistical mechanics of deep learning

Y Bahri, J Kadmon, J Pennington… - Annual Review of …, 2020 - annualreviews.org
The recent striking success of deep neural networks in machine learning raises profound
questions about the theoretical principles underlying their success. For example, what can …

Empirical analysis of the Hessian of over-parametrized neural networks

L Sagun, U Evci, VU Guney, Y Dauphin… - arXiv preprint arXiv …, 2017 - arxiv.org
We study the properties of common loss surfaces through their Hessian matrix. In particular,
in the context of deep learning, we empirically show that the spectrum of the Hessian is …
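
The central object in this kind of analysis is the eigenvalue spectrum of the training-loss Hessian. Below is a self-contained sketch of how one might inspect that spectrum on a network small enough to form the full Hessian explicitly, here by finite differences; the architecture, data, and the near-zero threshold are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 4-8-1 tanh network on toy data, small enough to form the full Hessian.
X = rng.standard_normal((64, 4))
y = rng.standard_normal(64)
d = 4 * 8 + 8 * 1   # 40 parameters in total

def loss(theta):
    W1 = theta[:32].reshape(4, 8)
    W2 = theta[32:].reshape(8, 1)
    pred = (np.tanh(X @ W1) @ W2).ravel()
    return np.mean((pred - y) ** 2)

def hessian_fd(f, theta, eps=1e-3):
    """Full Hessian by central finite differences; only feasible for small d."""
    n = theta.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * eps**2)
            H[j, i] = H[i, j]
    return H

theta = 0.1 * rng.standard_normal(d)
eigs = np.linalg.eigvalsh(hessian_fd(loss, theta))
print("three smallest / three largest eigenvalues:", eigs[:3], eigs[-3:])
print("fraction with |lambda| < 1e-3:", np.mean(np.abs(eigs) < 1e-3))
```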

Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks

P Chaudhari, S Soatto - 2018 Information Theory and …, 2018 - ieeexplore.ieee.org
Stochastic gradient descent (SGD) is widely believed to perform implicit regularization when
used to train deep neural networks, but the precise manner in which this occurs has thus far …
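
One way to make the underlying picture concrete: near a minimum, minibatch SGD behaves like a discretized stochastic process whose stationary distribution depends on the learning rate and the gradient-noise covariance, not on the loss alone. The toy simulation below uses an invented quadratic loss and noise covariance; it is a cartoon of that continuous-time view, not the paper's derivation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Quadratic loss f(w) = 0.5 w^T A w with additive gradient noise, a toy stand-in
# for minibatch SGD near a minimum. A and the noise covariance are invented.
A = np.diag([1.0, 10.0])
noise_cov = np.diag([4.0, 0.5])
C = np.linalg.cholesky(noise_cov)

def run_sgd(eta=0.05, steps=20000):
    w = np.array([2.0, 2.0])
    samples = []
    for t in range(steps):
        g = A @ w + C @ rng.standard_normal(2)      # noisy gradient
        w = w - eta * g
        if t > steps // 2:                          # discard the transient
            samples.append(w.copy())
    return np.array(samples)

S = run_sgd()
print("stationary mean:", S.mean(axis=0))
print("stationary covariance:\n", np.cov(S.T))
# For small eta (and commuting A, noise_cov) the covariance is roughly
# (eta / 2) * inv(A) @ noise_cov: it is set by the optimizer's hyperparameters
# and the gradient noise, not just by the location of the minimum.
```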

Optimal errors and phase transitions in high-dimensional generalized linear models

J Barbier, F Krzakala, N Macris, L Miolane… - Proceedings of the …, 2019 - pnas.org
Generalized linear models (GLMs) are used in high-dimensional machine learning,
statistics, communications, and signal processing. In this paper we analyze GLMs when the …
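
For orientation, a small teacher-student experiment in the setting such analyses consider: labels are generated by a known GLM teacher, a student fits logistic regression by gradient descent, and the overlap with the teacher is tracked as the sample ratio alpha = n/d grows. This uses plain gradient descent on a toy instance rather than the message-passing algorithms or replica computations of the paper, and all sizes and learning rates are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def overlap_at_ratio(alpha, d=200, steps=500, lr=0.5):
    """Teacher-student GLM: y = sign(X w*), student fits logistic regression by
    gradient descent; returns the normalized overlap between student and teacher."""
    n = int(alpha * d)
    w_star = rng.standard_normal(d)
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    y = np.sign(X @ w_star)
    w = np.zeros(d)
    for _ in range(steps):
        margins = np.clip(y * (X @ w), -30, 30)           # avoid overflow in exp
        grad = -(X.T @ (y / (1 + np.exp(margins)))) / n   # logistic-loss gradient
        w -= lr * grad
    return w @ w_star / (np.linalg.norm(w) * np.linalg.norm(w_star) + 1e-12)

for alpha in [0.5, 1.0, 2.0, 4.0]:
    print(f"n/d = {alpha:.1f}: overlap with teacher = {overlap_at_ratio(alpha):.3f}")
```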

Fast automated analysis of strong gravitational lenses with convolutional neural networks

YD Hezaveh, LP Levasseur, PJ Marshall - Nature, 2017 - nature.com
Quantifying image distortions caused by strong gravitational lensing—the formation of
multiple images of distant sources due to the deflection of their light by the gravity of …
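
The approach is, in outline, supervised regression: train a convolutional network on simulated lensed images so that lens-model parameters come from a single forward pass rather than an iterative likelihood fit. A hedged PyTorch sketch of that setup follows, with random tensors standing in for simulated images and true parameters, and with layer sizes and the number of output parameters chosen arbitrarily rather than taken from the paper.

```python
import torch
import torch.nn as nn

class LensParamNet(nn.Module):
    """Small CNN mapping a lensed image to a vector of lens-model parameters."""
    def __init__(self, n_params=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 16, 128), nn.ReLU(), nn.Linear(128, n_params),
        )

    def forward(self, x):
        return self.head(self.features(x))

# Toy training loop; random tensors stand in for simulated observations and
# their known lens parameters.
model = LensParamNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.randn(256, 1, 64, 64)
params = torch.randn(256, 5)
for epoch in range(5):
    pred = model(images)
    loss = nn.functional.mse_loss(pred, params)
    opt.zero_grad(); loss.backward(); opt.step()
    print(epoch, loss.item())
```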

Implicit self-regularization in deep neural networks: Evidence from random matrix theory and implications for learning

CH Martin, MW Mahoney - Journal of Machine Learning Research, 2021 - jmlr.org
Random Matrix Theory (RMT) is applied to analyze the weight matrices of Deep Neural
Networks (DNNs), including both production quality, pre-trained models such as AlexNet …
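
A small NumPy sketch of the basic diagnostic behind this line of work: form the correlation matrix of a layer's weight matrix, compute its eigenvalues, and compare them with the Marchenko-Pastur bulk expected for an i.i.d. random matrix. The "trained-like" matrix below is just a random matrix with a few rescaled columns, a crude stand-in for the correlations real training induces; the dimensions are arbitrary and no pre-trained model is loaded.

```python
import numpy as np

rng = np.random.default_rng(0)

def esd(W):
    """Empirical spectral density: eigenvalues of the correlation matrix W^T W / N."""
    return np.linalg.eigvalsh(W.T @ W / W.shape[0])

N, M = 1000, 400
q = M / N
mp_edge = (1 + np.sqrt(q)) ** 2          # upper edge of the Marchenko-Pastur bulk

# An i.i.d. random matrix: essentially no eigenvalues escape the MP bulk.
W_random = rng.standard_normal((N, M))
print("random matrix, eigenvalues above MP edge:",
      int(np.sum(esd(W_random) > mp_edge)))

# A matrix with a few rescaled columns, a crude stand-in for correlations that
# training induces: outlier eigenvalues appear well above the bulk.
W_trained_like = W_random.copy()
W_trained_like[:, :5] *= 3.0
print("correlated matrix, eigenvalues above MP edge:",
      int(np.sum(esd(W_trained_like) > mp_edge)))
```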