Machine learning and the physical sciences
Machine learning (ML) encompasses a broad range of algorithms and modeling tools used
for a vast array of data processing tasks, which has entered most scientific disciplines in …
Statistical mechanics of deep learning
The recent striking success of deep neural networks in machine learning raises profound
questions about the theoretical principles underlying their success. For example, what can …
XNOR-Net: ImageNet classification using binary convolutional neural networks
We propose two efficient approximations to standard convolutional neural networks: Binary-
Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are …
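For intuition, a minimal NumPy sketch of the Binary-Weight-Network approximation: each real-valued filter W is replaced by a scaled sign pattern alpha * B, with alpha the mean absolute weight (the closed-form scaling factor derived in the paper). The function name is illustrative, not from the paper's code.

```python
import numpy as np

def binarize_weights(W):
    """Approximate a real-valued filter W by alpha * B, where
    B = sign(W) and alpha = mean(|W|), the optimal L2 scaling
    factor for the binary approximation."""
    B = np.sign(W)
    B[B == 0] = 1.0              # map zeros to +1 so B is strictly binary
    alpha = np.mean(np.abs(W))
    return alpha, B

# usage: approximate a 3x3 convolution filter
W = np.random.randn(3, 3)
alpha, B = binarize_weights(W)
W_approx = alpha * B             # binary filter, scaled
```

Because B is binary, the convolution reduces to additions and subtractions; XNOR-Networks push this further by also binarizing the inputs, so convolutions become XNOR and bit-count operations.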
Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or −1
We introduce a method to train Binarized Neural Networks (BNNs), neural networks with
binary weights and activations at run-time. At training-time the binary weights and activations …
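The training trick that makes this work is the straight-through estimator: the forward pass uses sign(w), while the backward pass lets the gradient flow to the underlying real-valued weight wherever |w| <= 1. A minimal PyTorch sketch (class and variable names are illustrative):

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator:
    forward returns sign(x); backward passes the gradient through
    unchanged where |x| <= 1 and zeros it elsewhere."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)     # note: torch.sign maps 0 to 0

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)

# usage: real-valued weights are kept for the update; the binary
# version is what the forward pass actually uses
w = torch.randn(4, requires_grad=True)
wb = BinarizeSTE.apply(w)
loss = (wb.sum() - 1.0) ** 2
loss.backward()                  # gradients reach the real-valued w
```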
Quantized neural networks: Training neural networks with low precision weights and activations
We introduce a method to train Quantized Neural Networks (QNNs), neural networks with
extremely low precision (e.g., 1-bit) weights and activations, at run-time. At train-time the quantized weights and activations are used for computing the parameter gradients …
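By way of illustration, a uniform k-bit quantizer is one common building block for such low-precision schemes (a minimal sketch; the paper's actual quantization functions differ in detail):

```python
import numpy as np

def quantize(x, k):
    """Uniformly quantize values in [0, 1] onto a 2**k-level grid,
    an illustrative k-bit quantizer."""
    levels = 2 ** k - 1
    return np.round(x * levels) / levels

x = np.linspace(0.0, 1.0, 5)
print(quantize(x, 2))            # grid: 0, 1/3, 2/3, 1
```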
Binarized neural networks
We introduce a method to train Binarized Neural Networks (BNNs), neural networks with
binary weights and activations at run-time. At train-time the binary weights and activations …
Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data
GK Dziugaite, DM Roy - arXiv preprint arXiv:1703.11008, 2017 - arxiv.org
One of the defining properties of deep learning is that models are chosen to have many
more parameters than available training data. In light of this capacity for overfitting, it is …
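The object being optimized is, schematically, a PAC-Bayes bound; one standard form (constants differ slightly across variants, so take this as indicative rather than the paper's exact statement) reads, for posterior Q, prior P, m training samples, and confidence 1 − δ:

```latex
\mathrm{kl}\!\left( \hat{e}_S(Q) \,\middle\|\, e(Q) \right)
  \;\le\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{m}{\delta}}{m - 1}
```

where ê_S(Q) is the empirical error of the stochastic classifier Q on the sample S, e(Q) its true error, and kl the binary KL divergence; the right-hand side is then minimized over Q directly by gradient methods.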
Entropy-SGD: Biasing gradient descent into wide valleys
This paper proposes a new optimization algorithm called Entropy-SGD for training deep
neural networks that is motivated by the local geometry of the energy landscape. Local …
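A much-simplified single-tensor sketch of the algorithm's nested structure: an inner Langevin (SGLD) loop estimates the gradient of the local-entropy objective F(x; γ) = log ∫ exp(−f(x′) − (γ/2)‖x − x′‖²) dx′, and the outer step moves x toward the running average μ of the inner iterates. Hyperparameter names follow the paper loosely; values and the averaging constant are illustrative.

```python
import torch

def entropy_sgd_step(x, loss_fn, gamma=1e-3, eta=0.1,
                     eta_prime=0.1, L=5, eps=1e-4):
    """One simplified Entropy-SGD outer step for a parameter tensor x."""
    x = x.detach().clone()
    x_prime = x.clone()
    mu = x.clone()
    for _ in range(L):
        g = torch.autograd.grad(loss_fn(x_prime.requires_grad_()),
                                x_prime)[0]
        # SGLD step on the modified loss f(x') + gamma/2 * ||x' - x||^2
        x_prime = (x_prime - eta_prime * (g + gamma * (x_prime - x))
                   + eps * torch.randn_like(x_prime)).detach()
        mu = 0.75 * mu + 0.25 * x_prime      # running average of iterates
    # gradient of the negative local entropy is gamma * (x - mu)
    return x - eta * gamma * (x - mu)

# usage: descend a toy quadratic loss
w = torch.randn(10)
for _ in range(100):
    w = entropy_sgd_step(w, lambda p: (p ** 2).sum())
```

The γ term couples the inner chain to the current iterate, so the update is biased toward wide, flat regions where the local entropy is large.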
Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks
P Chaudhari, S Soatto - 2018 Information Theory and …, 2018 - ieeexplore.ieee.org
Stochastic gradient descent (SGD) is widely believed to perform implicit regularization when
used to train deep neural networks, but the precise manner in which this occurs has thus far …
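The starting point of the analysis is a continuous-time limit of SGD; schematically (with f the average loss, D(x) the minibatch gradient-noise covariance, η the learning rate, and b the batch size; the precise scaling follows the paper's conventions only approximately):

```latex
dx_t = -\nabla f(x_t)\, dt + \sqrt{2\beta^{-1} D(x_t)}\; dW_t,
\qquad \beta^{-1} \approx \frac{\eta}{2b}
```

When D is anisotropic, the stationary distribution need not be proportional to exp(−βf), and the resulting non-zero probability current is what produces the limit-cycle behavior described in the paper.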
Non-vacuous generalization bounds at the ImageNet scale: a PAC-Bayesian compression approach
Modern neural networks are highly overparameterized, with capacity to substantially overfit
to training data. Nevertheless, these networks often generalize well in practice. It has also …