Provable Tempered Overfitting of Minimal Nets and Typical Nets

I Harel, WM Hoza, G Vardi, I Evron, N Srebro… - arXiv preprint arXiv …, 2024 - arxiv.org
We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary
weights fitted to perfectly classify a noisy training set. We consider interpolation using both …

The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels

Y Slutzky, Y Alexander, N Razin, N Cohen - arXiv preprint arXiv …, 2024 - arxiv.org
Neural networks are powered by an implicit bias: a tendency of gradient descent to fit
training data in a way that generalizes to unseen data. A recent class of neural network …

Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes

D Qiao, K Zhang, E Singh, D Soudry… - arXiv preprint arXiv …, 2024 - arxiv.org
We study the generalization of two-layer ReLU neural networks in a univariate
nonparametric regression problem with noisy labels. This is a problem where kernels …