Recurrent neural networks as versatile tools of neuroscience research

O Barak - Current Opinion in Neurobiology, 2017 - Elsevier
Highlights:
• Recurrent neural networks (RNNs) are powerful models of neural systems.
• RNNs can be either designed or trained to perform a task.
• In both cases, low dimensional …

Complete dictionary recovery over the sphere I: Overview and the geometric picture

J Sun, Q Qu, J Wright - IEEE Transactions on Information …, 2016 - ieeexplore.ieee.org
We consider the problem of recovering a complete (i.e., square and invertible) matrix A_0
from Y ∈ R^{n×p} with Y = A_0 X_0, provided X_0 is sufficiently sparse. This recovery problem is …
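
For reference, the recovery problem in this snippet can be formalized as follows; this is a minimal restatement, and the precise sparsity level and success conditions are the subject of the paper:

$$Y = AX, \qquad A \in \mathbb{R}^{n \times n} \ \text{invertible}, \qquad X \in \mathbb{R}^{n \times p} \ \text{sparse},$$

where, under suitable conditions, any solution coincides with the ground truth up to permutation and scaling, i.e. $A = A_0 \Pi \Lambda$ and $X = \Lambda^{-1} \Pi^{\top} X_0$ for a permutation matrix $\Pi$ and an invertible diagonal matrix $\Lambda$.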

Towards understanding ensemble, knowledge distillation and self-distillation in deep learning

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2012.09816, 2020 - arxiv.org
We formally study how ensembles of deep learning models can improve test accuracy, and
how the superior performance of an ensemble can be distilled into a single model using …
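
For context, "knowledge distillation" here is in the sense of Hinton et al.: the student is trained on a mixture of the hard labels and the teacher's temperature-softened predictions. A standard form of the objective (background notation, not necessarily the exact loss analyzed in this paper) is

$$\mathcal{L}(\theta) = \alpha \, \mathrm{CE}\big(y, \, p_{\theta}(x)\big) + (1 - \alpha) \, \tau^{2} \, \mathrm{KL}\big(p_{T}^{(\tau)}(x) \, \| \, p_{\theta}^{(\tau)}(x)\big),$$

where $p^{(\tau)}$ denotes a softmax at temperature $\tau$, $p_{T}$ is the teacher (e.g. an ensemble average), and $\alpha$ balances the two terms.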

A convergence theory for deep learning via over-parameterization

Z Allen-Zhu, Y Li, Z Song - International conference on …, 2019 - proceedings.mlr.press
Deep neural networks (DNNs) have demonstrated dominant performance in many fields;
since AlexNet, the networks used in practice have grown wider and deeper. On the theoretical …

Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks

S Arora, S Du, W Hu, Z Li… - … conference on machine …, 2019 - proceedings.mlr.press
Recent works have shed some light on the mystery of why deep nets fit any data and
generalize despite being heavily overparameterized. This paper analyzes training and …
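
A representative result from this line of analysis is a data-dependent generalization bound: schematically, and up to lower-order terms, the population error of the trained two-layer network is controlled by

$$\sqrt{\frac{2 \, \mathbf{y}^{\top} (H^{\infty})^{-1} \mathbf{y}}{n}},$$

where $H^{\infty}$ is the Gram matrix of the limiting (neural tangent) kernel on the $n$ training points and $\mathbf{y}$ is the label vector.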

Gradient descent finds global minima of deep neural networks

S Du, J Lee, H Li, L Wang… - … conference on machine …, 2019 - proceedings.mlr.press
Gradient descent finds a global minimum in training deep neural networks despite the
objective function being non-convex. The current paper proves gradient descent achieves …
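
Guarantees of this kind are typically stated as linear convergence of the training loss once the network is wide enough. Schematically, with step size $\eta$ and $H^{\infty}$ the Gram matrix induced by the architecture at random initialization (the exact conditions and constants vary across the papers above),

$$L(\theta_{k}) \le \Big(1 - \frac{\eta \, \lambda_{\min}(H^{\infty})}{2}\Big)^{k} L(\theta_{0})$$

holds with high probability when the width is polynomial in the number of training samples.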

Learning and generalization in overparameterized neural networks, going beyond two layers

Z Allen-Zhu, Y Li, Y Liang - Advances in neural information …, 2019 - proceedings.neurips.cc

Gradient descent provably optimizes over-parameterized neural networks

SS Du, X Zhai, B Poczos, A Singh - arXiv preprint arXiv:1810.02054, 2018 - arxiv.org

Gradient descent with early stopping is provably robust to label noise for overparameterized neural networks

M Li, M Soltanolkotabi, S Oymak - … conference on artificial …, 2020 - proceedings.mlr.press
Modern neural networks are typically trained in an over-parameterized regime where the
number of model parameters far exceeds the size of the training data. Such neural networks in …