Learning and generalization in overparameterized neural networks, going beyond two layers
A mean field view of the landscape of two-layer neural networks
Multilayer neural networks are among the most powerful models in machine learning, yet the fundamental reasons for this success defy mathematical understanding. Learning a neural …
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
In this paper, we study the problem of learning a shallow artificial neural network that best fits a training data set. We study this problem in the over-parameterized regime where the …
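Concretely, the over-parameterized regime here means the hidden width far exceeds the number of training points. As a minimal sketch of that setting (an illustration of the regime, not this paper's analysis; the widths, step size, and synthetic data below are arbitrary choices), one can watch full-batch gradient descent drive a wide one-hidden-layer ReLU network toward interpolating random labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic regression task: n points in d dimensions.
n, d, m = 20, 5, 200            # m >> n: over-parameterized width (illustrative)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)      # arbitrary labels to interpolate

# One-hidden-layer ReLU network f(x) = a^T relu(W x).
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.standard_normal(m) / np.sqrt(m)

lr = 0.02                       # small enough for stable full-batch GD here
for step in range(2000):
    H = np.maximum(X @ W.T, 0.0)        # hidden activations, (n, m)
    pred = H @ a
    resid = (pred - y) / n              # gradient of 0.5 * mean squared error
    # Backpropagate through both layers.
    grad_a = H.T @ resid
    grad_H = np.outer(resid, a) * (H > 0)
    grad_W = grad_H.T @ X
    a -= lr * grad_a
    W -= lr * grad_W

print("final training MSE:",
      np.mean((np.maximum(X @ W.T, 0.0) @ a - y) ** 2))  # near zero
```

Because the width m far exceeds the sample size n, the hidden-feature matrix has full row rank almost surely and gradient descent can drive the training error to near zero; when and why this happens is what such landscape analyses formalize.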
A comparison of deep networks with ReLU activation function and linear spline-type methods
K Eckle, J Schmidt-Hieber - Neural Networks, 2019 - Elsevier
Deep neural networks (DNNs) generate much richer function spaces than shallow networks. Since the function spaces induced by shallow networks have several approximation …
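One concrete fact behind this comparison: with one-dimensional input, a one-hidden-layer ReLU network computes a continuous piecewise-linear function, i.e., a linear spline whose knots sit where individual units switch on. A small numerical check of that identity (illustrative only, not the paper's construction; the network sizes and interval are arbitrary):

```python
import numpy as np

# In 1-D, f(x) = sum_i a_i * relu(w_i * x + b_i) is continuous and
# piecewise linear, with knots at x = -b_i / w_i where units switch on.
rng = np.random.default_rng(1)
k = 5
w = rng.standard_normal(k)
b = rng.standard_normal(k)
a = rng.standard_normal(k)

def relu_net(x):
    return np.maximum(np.outer(x, w) + b, 0.0) @ a

knots = np.sort(-b / w)
knots = knots[(knots > -3) & (knots < 3)]   # keep knots inside the test interval

# A linear spline through the knot values reproduces the network exactly.
x = np.linspace(-3.0, 3.0, 2001)
pts = np.concatenate(([-3.0], knots, [3.0]))
spline = np.interp(x, pts, relu_net(pts))
print("max |ReLU net - linear spline|:", np.max(np.abs(relu_net(x) - spline)))
```

The printed gap is at machine precision, since both functions are linear between consecutive knots and agree at the knots themselves.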
Rademacher complexity for adversarially robust generalization
Many machine learning models are vulnerable to adversarial attacks; for example, adding adversarial perturbations that are imperceptible to humans can often make machine …
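A classic instance of this phenomenon, useful for fixing ideas, is the fast gradient sign method of Goodfellow et al. (a standard attack, not this paper's contribution): a per-coordinate perturbation of size eps, aligned against the classifier, can flip a prediction even though each coordinate moves only slightly. The NumPy sketch below shows this for a fixed linear classifier; the dimensions and budget are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 1000

# A fixed linear classifier: predict sign(w . x).
w = rng.standard_normal(d)

# A correctly classified example with label y = +1.
x = 0.1 * rng.standard_normal(d) + 2.0 * w / np.linalg.norm(w)
y = 1.0

# FGSM-style perturbation: move each coordinate by eps in the direction
# that increases the loss; for a linear model that is -y * sign(w).
eps = 0.1                             # small per-coordinate budget
x_adv = x - eps * y * np.sign(w)

print("clean margin     :", y * (w @ x))       # positive: correct
print("perturbed margin :", y * (w @ x_adv))   # negative here: prediction flips
print("max |perturbation|:", np.max(np.abs(x_adv - x)))
```

The margin drops by eps times the l1 norm of w, which in high dimension dwarfs the per-coordinate budget; bounding generalization under exactly this kind of l-infinity threat model is what the Rademacher analysis addresses.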
Simple recurrent units for highly parallelizable recurrence
Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit …
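The parallelization idea can be made concrete: in an SRU-style cell the heavy matrix multiplications depend only on the inputs, so they can be computed for all time steps at once, leaving only a cheap element-wise scan in the sequential loop. Below is a simplified sketch of that structure; it omits the actual SRU's highway/reset gate, so treat it as an assumption-laden caricature rather than the published architecture.

```python
import numpy as np

def sru_like(X, Wc, Wf, bf):
    """Simplified SRU-style layer. X has shape (T, d_in).
    Heavy matmuls are time-parallel; the recurrence is element-wise."""
    # Step 1: all matrix multiplications at once (parallel across time).
    C_tilde = X @ Wc.T                              # candidate states, (T, d)
    F = 1.0 / (1.0 + np.exp(-(X @ Wf.T + bf)))      # forget gates, (T, d)
    # Step 2: element-wise scan over time (no matmul inside the loop).
    c = np.zeros(Wc.shape[0])
    states = []
    for t in range(X.shape[0]):
        c = F[t] * c + (1.0 - F[t]) * C_tilde[t]
        states.append(c)
    return np.stack(states)

T, d_in, d = 8, 4, 6
rng = np.random.default_rng(3)
X = rng.standard_normal((T, d_in))
H = sru_like(X, rng.standard_normal((d, d_in)),
             rng.standard_normal((d, d_in)), np.zeros(d))
print(H.shape)   # (8, 6)
```

By contrast, an LSTM's gates depend on the previous hidden state, so its matrix multiplications cannot be hoisted out of the time loop; that is the scaling difficulty the abstract refers to.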
Beyond sparsity: Tree regularization of deep models for interpretability
The lack of interpretability remains a key barrier to the adoption of deep models in many applications. In this work, we explicitly regularize deep models so human users might step …
Recovery guarantees for one-hidden-layer neural networks
In this paper, we consider regression problems with one-hidden-layer neural networks (1NNs). We distill some properties of activation functions that lead to local strong convexity …
Learning one-hidden-layer neural networks with landscape design
We consider the problem of learning a one-hidden-layer neural network: we assume the input $x \in \mathbb{R}^d$ is drawn from a Gaussian distribution and the label $y = a^\top\sigma$ …
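The label model is cut off in the snippet; for concreteness, here is how one might generate data from a one-hidden-layer teacher of the general form the abstract suggests. The completion $y = a^\top \sigma(Bx)$, the ReLU choice for $\sigma$, and all dimensions below are assumptions for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
d, k, n = 10, 4, 500        # input dim, hidden width, sample size (arbitrary)

# Hypothetical ground-truth ("teacher") parameters -- the paper's exact
# label model is truncated in the snippet; B, a, and ReLU are assumptions.
B = rng.standard_normal((k, d))
a = rng.standard_normal(k)

X = rng.standard_normal((n, d))            # x ~ N(0, I_d), as in the abstract
y = np.maximum(X @ B.T, 0.0) @ a           # y = a^T sigma(B x), sigma = ReLU

print(X.shape, y.shape)                    # (500, 10) (500,)
```

Recovering $a$ and $B$ from such samples is the learning problem whose optimization landscape the paper designs and analyzes.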
What can ResNet learn efficiently, going beyond kernels?
How can neural networks such as ResNet \emph{efficiently} learn CIFAR-10 with test accuracy more than $96\%$, while other methods, especially kernel methods, fall relatively …