Learning and generalization in overparameterized neural networks, going beyond two layers

Z Allen-Zhu, Y Li, Y Liang - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc

A mean field view of the landscape of two-layer neural networks

S Mei, A Montanari, PM Nguyen - Proceedings of the National Academy of Sciences, 2018 - National Acad Sciences
Multilayer neural networks are among the most powerful models in machine learning, yet the
fundamental reasons for this success defy mathematical understanding. Learning a neural …

Theoretical insights into the optimization landscape of over-parameterized shallow neural networks

M Soltanolkotabi, A Javanmard… - IEEE Transactions on Information Theory, 2018 - ieeexplore.ieee.org
In this paper, we study the problem of learning a shallow artificial neural network that best
fits a training data set. We study this problem in the over-parameterized regime where the …

A comparison of deep networks with ReLU activation function and linear spline-type methods

K Eckle, J Schmidt-Hieber - Neural Networks, 2019 - Elsevier
Deep neural networks (DNNs) generate much richer function spaces than shallow networks.
Since the function spaces induced by shallow networks have several approximation …
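
To make the comparison in the title concrete, here is a minimal NumPy sketch (our illustration, not the authors' construction) of the standard fact that a one-hidden-layer ReLU network in one dimension computes a linear spline with knots at the hidden units' breakpoints; the function relu_net, the knots, and the weights are all hypothetical.

```python
# A one-hidden-layer ReLU network in 1D is a linear spline:
# f(x) = bias + sum_i weights[i] * relu(x - knots[i]).
import numpy as np

def relu_net(x, weights, knots, bias=0.0):
    """Evaluate the piecewise-linear function at each point of x."""
    return bias + np.sum(weights * np.maximum(x[:, None] - knots, 0.0), axis=1)

x = np.linspace(-2.0, 2.0, 9)
knots = np.array([-1.0, 0.0, 1.0])    # hypothetical spline breakpoints
weights = np.array([1.0, -2.0, 1.0])  # slope change at each knot
print(relu_net(x, weights, knots))    # a hat-shaped piecewise-linear function
```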

Rademacher complexity for adversarially robust generalization

D Yin, R Kannan, P Bartlett - International conference on …, 2019 - proceedings.mlr.press
Many machine learning models are vulnerable to adversarial attacks; for example, adding
adversarial perturbations that are imperceptible to humans can often make machine …
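
As a hedged illustration of the bounded perturbations this snippet alludes to (our example, not the paper's analysis): for a linear classifier, the worst-case perturbation within an ℓ∞ ball has a closed form. The sketch below in NumPy uses that closed form; all names and constants are illustrative.

```python
# Worst-case l_inf perturbation for a linear classifier f(x) = <w, x>:
# within ||delta||_inf <= eps, the loss-maximizing delta is -y * eps * sign(w),
# which shrinks the margin by exactly eps * ||w||_1.
import numpy as np

rng = np.random.default_rng(0)
d, eps = 20, 0.1
w = rng.normal(size=d)              # a (hypothetical) trained linear classifier
x = rng.normal(size=d)              # a test point
y = np.sign(w @ x)                  # treat its clean prediction as the label

delta = -y * eps * np.sign(w)       # worst-case bounded perturbation
margin_clean = y * (w @ x)
margin_adv = y * (w @ (x + delta))  # equals margin_clean - eps * ||w||_1

print(f"clean margin {margin_clean:.3f}, adversarial margin {margin_adv:.3f}")
```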

Simple recurrent units for highly parallelizable recurrence

T Lei, Y Zhang, SI Wang, H Dai, Y Artzi - arXiv preprint arXiv:1709.02755, 2017 - arxiv.org
Common recurrent neural architectures scale poorly due to the intrinsic difficulty in
parallelizing their state computations. In this work, we propose the Simple Recurrent Unit …
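
Below is a minimal NumPy sketch of an SRU-style recurrence under our reading of the design (the exact gating terms vary across versions of the paper, so treat the formulas as an assumption, not the authors' code). The point it illustrates: all matrix products depend only on the inputs and can be computed for every time step at once, leaving only a cheap elementwise loop sequential.

```python
# SRU-style recurrence (assumed form):
#   u_t = W x_t;  f_t = sigmoid(Wf x_t + bf);  r_t = sigmoid(Wr x_t + br)
#   c_t = f_t * c_{t-1} + (1 - f_t) * u_t
#   h_t = r_t * tanh(c_t) + (1 - r_t) * x_t   (highway connection)
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru_forward(X, W, Wf, bf, Wr, br):
    """X: (T, d) inputs; all weights (d, d); returns hidden states (T, d)."""
    U = X @ W                   # candidate states for all t, in one matmul
    F = sigmoid(X @ Wf + bf)    # forget gates, also parallel over time
    R = sigmoid(X @ Wr + br)    # reset gates
    c = np.zeros(X.shape[1])
    H = np.empty_like(X)
    for t in range(X.shape[0]):  # only this elementwise update is sequential
        c = F[t] * c + (1.0 - F[t]) * U[t]
        H[t] = R[t] * np.tanh(c) + (1.0 - R[t]) * X[t]
    return H

rng = np.random.default_rng(0)
T, d = 5, 8
X = rng.normal(size=(T, d))
W, Wf, Wr = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
bf = br = np.zeros(d)
print(sru_forward(X, W, Wf, bf, Wr, br).shape)  # (5, 8)
```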

Beyond sparsity: Tree regularization of deep models for interpretability

M Wu, M Hughes, S Parbhoo, M Zazzi, V Roth… - Proceedings of the AAAI Conference on Artificial Intelligence, 2018 - ojs.aaai.org
The lack of interpretability remains a key barrier to the adoption of deep models in many
applications. In this work, we explicitly regularize deep models so human users might step …

Recovery guarantees for one-hidden-layer neural networks

K Zhong, Z Song, P Jain, PL Bartlett… - International Conference on Machine Learning, 2017 - proceedings.mlr.press
In this paper, we consider regression problems with one-hidden-layer neural networks
(1NNs). We distill some properties of activation functions that lead to local strong convexity …

Learning one-hidden-layer neural networks with landscape design

R Ge, JD Lee, T Ma - arXiv preprint arXiv:1711.00501, 2017 - arxiv.org
We consider the problem of learning a one-hidden-layer neural network: we assume the
input $x \in \mathbb{R}^d$ is drawn from a Gaussian distribution and the label $y = a^\top\sigma$ …
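
The snippet's label formula is truncated; a common completion in this literature, which we assume here purely for illustration, is $y = a^\top\sigma(Bx) + \text{noise}$ with ReLU $\sigma$. A minimal NumPy sketch of sampling from such a teacher network:

```python
# Assumed teacher model y = a^T sigma(B x) + noise
# (the snippet's formula is truncated; this completion is our assumption).
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 10, 4, 1000               # input dim, hidden width, sample count

B = rng.normal(size=(m, d))         # hidden-layer weights (full rank w.h.p.)
a = rng.uniform(0.5, 1.5, size=m)   # nonnegative output weights

X = rng.normal(size=(n, d))         # Gaussian inputs, as in the snippet
relu = lambda z: np.maximum(z, 0.0)  # sigma assumed to be ReLU
y = relu(X @ B.T) @ a + 0.01 * rng.normal(size=n)  # labels with small noise

print(X.shape, y.shape)             # (1000, 10) (1000,)
```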

What can resnet learn efficiently, going beyond kernels?

Z Allen-Zhu, Y Li - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc
How can neural networks such as ResNet efficiently learn CIFAR-10 with test
accuracy more than 96%, while other methods, especially kernel methods, fall relatively …