A selective overview of deep learning
Deep learning has achieved tremendous success in recent years. In simple terms, deep
learning uses the composition of many nonlinear functions to model the complex …
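A minimal sketch of that compositional view, using only NumPy; the shapes and the ReLU nonlinearity are illustrative choices, not anything specific to the overview:

```python
import numpy as np

# A depth-3 "network" is just nested nonlinear maps: f(x) = W3 relu(W2 relu(W1 x)).
relu = lambda z: np.maximum(z, 0.0)
rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))   # first layer:  R^4 -> R^8
W2 = rng.standard_normal((8, 8))   # second layer: R^8 -> R^8
W3 = rng.standard_normal((1, 8))   # output layer: R^8 -> R
f = lambda x: W3 @ relu(W2 @ relu(W1 @ x))
print(f(rng.standard_normal(4)))   # a scalar prediction
```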
The computational limits of deep learning
Deep learning's recent history has been one of achievement: from triumphing over humans
in the game of Go to world-leading performance in image classification, voice recognition …
Learning imbalanced datasets with label-distribution-aware margin loss
Deep learning algorithms can fare poorly when the training dataset suffers from heavy class-
imbalance but the testing criterion requires good generalization on less frequent classes …
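A minimal sketch of a label-distribution-aware margin (LDAM) loss of the kind the paper proposes, assuming a PyTorch setup; the constant `c` and the way class counts are passed in are illustrative, and the paper additionally normalizes and scales the margins:

```python
import torch
import torch.nn.functional as F

def ldam_loss(logits, targets, class_counts, c=0.5):
    """Cross-entropy with a per-class margin delta_j = c * n_j^(-1/4),
    so rarer classes are pushed further from the decision boundary."""
    counts = torch.as_tensor(class_counts, dtype=torch.float32, device=logits.device)
    deltas = c / counts.pow(0.25)                        # larger margin for rare classes
    margin = torch.zeros_like(logits)
    margin[torch.arange(logits.size(0)), targets] = deltas[targets]
    # Shift the true-class logit down by its margin, then apply cross-entropy.
    return F.cross_entropy(logits - margin, targets)
```

For example, `ldam_loss(model(x), y, class_counts=[900, 90, 10])` demands the largest margin on the 10-sample class.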
Fantastic generalization measures and where to find them
Generalization of deep networks has been of great interest in recent years, resulting in a
number of theoretically and empirically motivated complexity measures. However, most …
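As one concrete instance of such a measure (a sketch of a single family, not the paper's full battery; `spectral_complexity` is a hypothetical helper name), a norm-based complexity proxy can be computed as the product of the layers' spectral norms:

```python
import torch

def spectral_complexity(net: torch.nn.Module) -> float:
    """Product of the largest singular values of the weight matrices,
    one family of norm-based measures compared in this literature."""
    prod = 1.0
    for p in net.parameters():
        if p.dim() == 2:                                   # weight matrices only
            prod *= torch.linalg.matrix_norm(p, ord=2).item()
    return prod
```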
Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks
Recent works have cast some light on the mystery of why deep nets fit any data and
generalize despite being heavily overparameterized. This paper analyzes training and …
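Much of this analysis revolves around a fixed Gram matrix H^∞ of the infinite-width two-layer ReLU network; a sketch of its closed form for unit-norm inputs (NumPy only; the function name is mine):

```python
import numpy as np

def ntk_gram(X):
    """H_ij = x_i.x_j * (pi - arccos(x_i.x_j)) / (2*pi) for unit-norm rows x_i;
    the spectrum of this matrix drives the convergence and generalization bounds."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # project rows to the unit sphere
    G = np.clip(Xn @ Xn.T, -1.0, 1.0)                   # pairwise cosine similarities
    return G * (np.pi - np.arccos(G)) / (2 * np.pi)
```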
A theoretical analysis of deep Q-learning
Despite the great empirical success of deep reinforcement learning, its theoretical
foundation is less well understood. In this work, we make the first attempt to theoretically …
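A minimal sketch of the Bellman-target regression step that deep Q-learning performs, which is the object such analyses study; the tensor layout of `batch` and the use of a frozen target network are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

def dqn_td_loss(q_net, target_net, batch, gamma=0.99):
    """Regress Q(s, a) toward the bootstrapped target r + gamma * max_a' Q_target(s', a')."""
    s, a, r, s2, done = batch                             # states, actions, rewards, next states, done flags
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) for the taken actions
    with torch.no_grad():                                 # no gradient through the target network
        target = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    return F.mse_loss(q_sa, target)
```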
The pitfalls of simplicity bias in neural networks
Several works have proposed Simplicity Bias (SB)---the tendency of standard training
procedures such as Stochastic Gradient Descent (SGD) to find simple models---to justify why …
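A toy illustration of the claim (my construction, not the paper's benchmark): both coordinates below predict the label perfectly, but one does so linearly and the other through interleaved slabs; after SGD training, randomizing the slab coordinate barely moves accuracy while randomizing the linear one destroys it:

```python
import torch

torch.manual_seed(0)
n = 4000
y = torch.randint(0, 2, (n,))
x_lin = (2 * y.float() - 1) + 0.1 * torch.randn(n)            # simple: linearly separable
pick = lambda c, k: c[torch.randint(len(c), (k,))]
x_slab = torch.where(y.bool(),                                # complex: interleaved slabs
                     pick(torch.tensor([-2.0, 0.0, 2.0]), n),
                     pick(torch.tensor([-3.0, -1.0, 1.0, 3.0]), n)) + 0.1 * torch.randn(n)
X = torch.stack([x_lin, x_slab], dim=1)

net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
opt = torch.optim.SGD(net.parameters(), lr=0.1)
for _ in range(500):                                          # full-batch SGD
    opt.zero_grad()
    torch.nn.functional.cross_entropy(net(X), y).backward()
    opt.step()

acc = lambda Z: (net(Z).argmax(1) == y).float().mean().item()
X_a, X_b = X.clone(), X.clone()
X_a[:, 1] = X[torch.randperm(n), 1]                           # break the slab feature
X_b[:, 0] = X[torch.randperm(n), 0]                           # break the linear feature
print(f"full={acc(X):.2f}  slab-randomized={acc(X_a):.2f}  linear-randomized={acc(X_b):.2f}")
```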
Gradient descent optimizes over-parameterized deep ReLU networks
We study the problem of training deep fully connected neural networks with the Rectified Linear
Unit (ReLU) activation function and the cross-entropy loss function for binary classification using …
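A toy sketch of the regime the paper studies: plain full-batch gradient descent on a deep, wide, fully connected ReLU network with the cross-entropy (logistic) loss for binary labels; the widths, learning rate, and data below are made up:

```python
import torch

torch.manual_seed(0)
X = torch.randn(64, 10)
y = (X[:, 0] > 0).float()                         # binary labels in {0, 1}

net = torch.nn.Sequential(                        # deep fully connected ReLU net
    torch.nn.Linear(10, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 1),
)
opt = torch.optim.SGD(net.parameters(), lr=0.1)   # full batch => plain gradient descent
loss_fn = torch.nn.BCEWithLogitsLoss()            # cross-entropy for binary classification

for step in range(201):
    opt.zero_grad()
    loss = loss_fn(net(X).squeeze(1), y)
    loss.backward()
    opt.step()
    if step % 50 == 0:
        print(step, loss.item())                  # training loss drives toward zero
```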
Frequency principle: Fourier analysis sheds light on deep neural networks
We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis
perspective. We demonstrate a universal Frequency Principle (F-Principle): DNNs …
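The phenomenon is easy to reproduce; a sketch of the standard 1-D experiment (frequencies, width, and optimizer here are illustrative): fit a signal with a low and a high frequency component and watch the residual's spectrum; the k=1 error shrinks well before the k=20 error does:

```python
import numpy as np
import torch

x = torch.arange(256).unsqueeze(1) / 256.0                      # uniform grid on [0, 1)
y = torch.sin(2 * np.pi * x) + 0.5 * torch.sin(2 * np.pi * 20 * x)

net = torch.nn.Sequential(torch.nn.Linear(1, 256), torch.nn.Tanh(),
                          torch.nn.Linear(256, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2001):
    opt.zero_grad()
    loss = torch.mean((net(x) - y) ** 2)
    loss.backward()
    opt.step()
    if step % 500 == 0:
        res = (net(x) - y).detach().squeeze().numpy()           # residual on the grid
        spec = np.abs(np.fft.rfft(res))                         # its Fourier magnitudes
        print(step, f"low(k=1)={spec[1]:.3f}", f"high(k=20)={spec[20]:.3f}")
```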
Generalization bounds of stochastic gradient descent for wide and deep neural networks
We study the training and generalization of deep neural networks (DNNs) in the over-
parameterized regime, where the network width (i.e., the number of hidden nodes per layer) is …