On the implicit bias in deep-learning algorithms

G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly successful in recent years and has led to dramatic improvements in multiple domains …
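
As a concrete illustration of the phenomenon the survey studies (a minimal sketch, not code from the article; the data and step size here are made up), consider gradient descent on the logistic loss for a linear classifier over separable data:

    import numpy as np

    # On separable data the logistic loss has no finite minimizer: scaling
    # up w always decreases it. So ||w|| diverges, but the *direction*
    # w / ||w|| stabilizes toward the maximum-margin separator -- the
    # textbook example of an implicit bias of the optimizer.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = np.sign(X @ np.array([1.0, 0.5]))          # labels in {-1, +1}

    w = np.zeros(2)
    for step in range(1, 100001):
        margins = y * (X @ w)
        coef = np.exp(-np.logaddexp(0.0, margins))  # sigmoid(-margin), stable
        w += 0.1 * (X * (y * coef)[:, None]).mean(axis=0)
        if step in (100, 10000, 100000):
            print(step, round(np.linalg.norm(w), 2), w / np.linalg.norm(w))

The printed norm keeps growing while the printed direction barely moves; the survey's subject is what plays the role of this limiting direction in deep, nonlinear networks.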

Neural networks are convex regularizers: Exact polynomial-time convex optimization formulations for two-layer networks

M Pilanci, T Ergen - International Conference on Machine …, 2020 - proceedings.mlr.press
We develop exact representations of training two-layer neural networks with rectified linear
units (ReLUs) in terms of a single convex program with the number of variables polynomial in …
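
To convey the flavor of the construction (a hedged sketch, not the paper's code): for each activation pattern D_i = diag(1[X u_i >= 0]), the reformulation introduces a pair of gated linear neurons with a group-norm penalty playing the role of weight decay. The exact program enumerates all patterns; the cvxpy sketch below samples a random subset, so it solves a restriction of the true convex program:

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, d, P = 40, 3, 50                            # samples, dim, sampled patterns
    X = rng.normal(size=(n, d))
    y = np.maximum(X @ rng.normal(size=d), 0.0)    # targets from a planted ReLU
    beta = 1e-3                                    # weight-decay strength

    # Each random direction u induces a 0/1 gating pattern over the data.
    D = (X @ rng.normal(size=(d, P)) >= 0).astype(float)   # (n, P)

    V = cp.Variable((d, P))                        # "positive" neurons v_i
    W = cp.Variable((d, P))                        # "negative" neurons w_i
    pred = cp.sum(cp.multiply(D, X @ (V - W)), axis=1)
    cons = []
    for i in range(P):                             # (2 D_i - I) X v_i >= 0, etc.
        s = 2 * D[:, i:i+1] - 1
        cons += [cp.multiply(s, X @ V[:, i:i+1]) >= 0,
                 cp.multiply(s, X @ W[:, i:i+1]) >= 0]
    reg = cp.sum(cp.norm(V, 2, axis=0)) + cp.sum(cp.norm(W, 2, axis=0))
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(pred - y) + beta * reg), cons)
    prob.solve()
    print("objective of the restricted convex program:", prob.value)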

Implicit regularization towards rank minimization in ReLU networks

N Timor, G Vardi, O Shamir - International Conference on …, 2023 - proceedings.mlr.press
We study the conjectured relationship between the implicit regularization in neural networks,
trained with gradient-based methods, and rank minimization of their weight matrices …
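
One way to probe this conjecture numerically (an illustrative sketch; the width, loss, and training budget are mine, not the paper's setting) is to train a ReLU network on separable data and inspect the singular-value decay of each weight matrix:

    import torch

    torch.manual_seed(0)
    X = torch.randn(256, 10)
    y = torch.sign(X[:, 0])                        # labels in {-1, +1}

    net = torch.nn.Sequential(
        torch.nn.Linear(10, 100, bias=False), torch.nn.ReLU(),
        torch.nn.Linear(100, 100, bias=False), torch.nn.ReLU(),
        torch.nn.Linear(100, 1, bias=False),
    )
    opt = torch.optim.SGD(net.parameters(), lr=0.05)
    for _ in range(5000):
        opt.zero_grad()
        # logistic loss log(1 + exp(-y f(x))), written stably via softplus
        loss = torch.nn.functional.softplus(-y * net(X).squeeze(-1)).mean()
        loss.backward()
        opt.step()

    for name, p in net.named_parameters():
        s = torch.linalg.svdvals(p.detach())
        print(name, "top singular values:", [round(float(t), 3) for t in s[:5]])

Whether a sharp rank drop shows up depends on the regime the paper analyzes; the snippet is only a diagnostic.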

On the effective number of linear regions in shallow univariate ReLU networks: Convergence guarantees and implicit bias

I Safran, G Vardi, JD Lee - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural
networks with a single hidden layer in a binary classification setting. We show that when the …
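
To make "linear regions" concrete (a sketch under my own toy parameters): a univariate one-hidden-layer ReLU network f(x) = sum_j v_j * relu(w_j x + b_j) is piecewise linear, with a potential breakpoint at x = -b_j / w_j for every neuron with w_j != 0, so its effective regions can be counted directly:

    import numpy as np

    rng = np.random.default_rng(0)
    width = 50
    w = rng.normal(size=width)
    b = rng.normal(size=width)
    v = rng.normal(size=width)

    active = w != 0
    breaks = -b[active] / w[active]
    print("upper bound:", len(np.unique(breaks)) + 1, "regions for width", width)

    # A breakpoint only counts if the slope actually changes there; the
    # slope jump contributed by neuron j is v_j * w_j, and neurons sharing
    # a breakpoint can cancel. Group breakpoints and sum their jumps.
    total = {}
    for p, j in zip(np.round(breaks, 10), v[active] * w[active]):
        total[p] = total.get(p, 0.0) + j
    effective = sum(abs(j) > 1e-10 for j in total.values())
    print("effective regions:", effective + 1)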

Revealing the structure of deep neural networks via convex duality

T Ergen, M Pilanci - International Conference on Machine …, 2021 - proceedings.mlr.press
We study regularized deep neural networks (DNNs) and introduce a convex analytic
framework to characterize the structure of the hidden layers. We show that a set of optimal …

Learning a neuron by a shallow ReLU network: Dynamics and implicit bias for correlated inputs

D Chistikov, M Englert, R Lazic - Advances in Neural …, 2023 - proceedings.neurips.cc
We prove that, for the fundamental regression task of learning a single neuron, training a
one-hidden-layer ReLU network of any width by gradient flow from a small initialisation …
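
The setting can be simulated directly (a sketch: gradient flow is discretized here as small-step gradient descent, and the dimensions, width, and initialization scale are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    d, width, n = 5, 20, 100
    u = np.eye(d)[0]                               # planted single-neuron teacher
    X = rng.normal(size=(n, d))
    y = np.maximum(X @ u, 0.0)

    W = 1e-3 * rng.normal(size=(width, d))         # small initialization
    v = 1e-3 * rng.normal(size=width)

    lr = 1e-2                                      # small step ~ gradient flow
    for _ in range(100000):
        H = np.maximum(X @ W.T, 0.0)               # (n, width) activations
        r = H @ v - y                              # residuals
        v -= lr * (H.T @ r) / n
        W -= lr * ((r[:, None] * (H > 0)) * v).T @ X / n
    print("final squared loss:", 0.5 * np.mean(r ** 2))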

Global optimality beyond two layers: Training deep ReLU networks via convex programs

T Ergen, M Pilanci - International Conference on Machine …, 2021 - proceedings.mlr.press
Understanding the fundamental mechanism behind the success of deep neural networks is
one of the key challenges in the modern machine learning literature. Despite numerous …

How do minimum-norm shallow denoisers look in function space?

C Zeno, G Ongie, Y Blumenfeld… - Advances in …, 2023 - proceedings.neurips.cc
Neural network (NN) denoisers are an essential building block in many common tasks,
ranging from image reconstruction to image generation. However, the success of these …

Noisy interpolation learning with shallow univariate ReLU networks

N Joshi, G Vardi, N Srebro - arXiv preprint arXiv:2307.15396, 2023 - arxiv.org
Understanding how overparameterized neural networks generalize despite perfect
interpolation of noisy training data is a fundamental question. Mallinar et al. (2022) noted that …
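
The basic experiment behind this question looks roughly as follows (an illustrative sketch; the optimizer, noise level, and sizes are mine): fit noisy one-dimensional data to near-interpolation with a wide shallow ReLU network, then compare the resulting test risk with the Bayes risk fixed by the label noise:

    import torch

    torch.manual_seed(0)
    n, width, sigma = 30, 2000, 0.3
    x = torch.rand(n, 1) * 2 - 1
    y = torch.sin(3 * x) + sigma * torch.randn(n, 1)   # noisy targets

    net = torch.nn.Sequential(torch.nn.Linear(1, width), torch.nn.ReLU(),
                              torch.nn.Linear(width, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(20000):
        opt.zero_grad()
        loss = ((net(x) - y) ** 2).mean()
        loss.backward()
        opt.step()

    xt = torch.linspace(-1, 1, 500).unsqueeze(1)
    with torch.no_grad():
        risk = ((net(xt) - torch.sin(3 * xt)) ** 2).mean() + sigma ** 2
    print("train MSE:", float(loss), "| test risk:", float(risk),
          "| Bayes risk:", sigma ** 2)

Whether the gap between test risk and Bayes risk stays bounded ("tempered" overfitting) or blows up is exactly what the paper analyzes.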

On margin maximization in linear and ReLU networks

G Vardi, O Shamir, N Srebro - Advances in Neural …, 2022 - proceedings.neurips.cc
The implicit bias of neural networks has been extensively studied in recent years. Lyu and Li
(2019) showed that in homogeneous networks trained with the exponential or the logistic …
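
For context, the truncated sentence refers to Lyu and Li's theorem that, for positively homogeneous networks trained with the exponential or logistic loss, gradient flow converges in direction to a KKT point of the margin-maximization problem

    \min_{\theta} \; \tfrac{1}{2}\,\lVert \theta \rVert_2^2
    \quad \text{s.t.} \quad y_i \, f(\theta; x_i) \ge 1 \quad \text{for all } i,

where f(\theta; \cdot) is the network. The paper's question is when such KKT points are in fact (local or global) margin maximizers.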