On the implicit bias in deep-learning algorithms
G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly successful in recent years and has led to dramatic improvements in multiple domains …
Optimization for deep learning: An overview
RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research due to various reasons. First, its …
On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks
Physics-informed neural networks (PINNs) are demonstrating remarkable promise in
integrating physical models with gappy and noisy observational data, but they still struggle …
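The title refers to Fourier feature mappings; below is a minimal, illustrative sketch (not the paper's PINN setup) of a random Fourier feature embedding followed by a linear fit, where the frequency scale `sigma` and the synthetic multi-scale target are assumed values:

```python
import numpy as np

def fourier_features(x, B):
    """Map inputs x (n, d) to [cos(2*pi*x@B.T), sin(2*pi*x@B.T)], shape (n, 2m)."""
    proj = 2.0 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=1)

rng = np.random.default_rng(0)
d, m, sigma = 1, 64, 10.0                 # sigma controls the frequency scale (assumed value)
B = sigma * rng.normal(size=(m, d))

x = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
y = np.sin(2 * np.pi * 4 * x) + 0.5 * np.sin(2 * np.pi * 16 * x)   # synthetic multi-scale target

# A linear fit on the Fourier features stands in for the network's last layer.
phi = fourier_features(x, B)
w, *_ = np.linalg.lstsq(phi, y, rcond=None)
print("train MSE:", float(np.mean((phi @ w - y) ** 2)))
```

Larger `sigma` puts more of the embedding's energy on high frequencies, which is roughly the knob the eigenvector-bias analysis in the title is concerned with.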
How neural networks extrapolate: From feedforward to graph neural networks
We study how neural networks trained by gradient descent extrapolate, i.e., what they learn
outside the support of the training distribution. Previous works report mixed empirical results …
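As a rough illustration of "extrapolation outside the support" (a sketch, not the paper's experiments): a one-hidden-layer ReLU network is fit to y = x² on [-1, 1] and then queried well outside that interval, where ReLU MLPs tend to behave roughly linearly in the input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data restricted to [-1, 1]; the target is quadratic.
x_train = np.linspace(-1.0, 1.0, 128).reshape(-1, 1)
y_train = x_train ** 2

# One-hidden-layer ReLU network trained by plain gradient descent.
h = 64
W1 = rng.normal(size=(1, h))
b1 = np.zeros(h)
W2 = rng.normal(scale=1.0 / np.sqrt(h), size=(h, 1))
b2 = np.zeros(1)

lr = 0.05
for _ in range(5000):
    z = x_train @ W1 + b1
    a = np.maximum(z, 0.0)
    pred = a @ W2 + b2
    err = pred - y_train                  # gradient of 0.5 * MSE w.r.t. pred
    gW2 = a.T @ err / len(x_train)
    gb2 = err.mean(axis=0)
    dz = (err @ W2.T) * (z > 0)
    gW1 = x_train.T @ dz / len(x_train)
    gb1 = dz.mean(axis=0)
    W2 -= lr * gW2
    b2 -= lr * gb2
    W1 -= lr * gW1
    b1 -= lr * gb1

# Query far outside the training support and compare with the true x^2.
x_test = np.array([[2.0], [3.0], [4.0]])
pred_test = np.maximum(x_test @ W1 + b1, 0.0) @ W2 + b2
print(np.c_[x_test, pred_test, x_test ** 2])
```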
Universal approximation with deep narrow networks
P Kidger, T Lyons - Conference on learning theory, 2020 - proceedings.mlr.press
The classical Universal Approximation Theorem holds for neural networks of
arbitrary width and bounded depth. Here we consider the natural 'dual' scenario for networks …
Kernel and rich regimes in overparametrized models
A recent line of work studies overparametrized neural networks in the “kernel regime,” i.e.,
when during training the network behaves as a kernelized linear predictor, and thus, training …
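A minimal sketch of what "behaves as a kernelized linear predictor" means, assuming a wide two-layer ReLU network linearized at its random initialization (tangent features taken with respect to the hidden weights only); this is an illustration of the idea, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 32, 2, 2048                      # wide hidden layer to mimic the kernel regime

x = rng.normal(size=(n, d))
y = np.sin(x[:, :1])                       # arbitrary smooth target

# Two-layer net f(x) = a^T relu(W x) / sqrt(h) at a random initialization.
W0 = rng.normal(size=(h, d))
a0 = rng.choice([-1.0, 1.0], size=(h, 1))

def tangent_features(x):
    """Per-example gradient of f with respect to the hidden weights W at initialization."""
    z = x @ W0.T                           # (n, h)
    act = (z > 0).astype(float)            # relu'(z)
    # d f / d W_{j,:} = a0_j * relu'(z_j) * x, flattened into one feature vector per example
    return ((act * a0.T)[:, :, None] * x[:, None, :] / np.sqrt(h)).reshape(len(x), -1)

phi = tangent_features(x)                  # shape (n, h*d)

# Training only the linearized model is kernel (ridge) regression with the tangent kernel.
K = phi @ phi.T
alpha = np.linalg.solve(K + 1e-6 * np.eye(n), y)
pred = K @ alpha
print("train MSE of the linearized (kernel) predictor:", float(np.mean((pred - y) ** 2)))
```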
Gradient descent on two-layer nets: Margin maximization and simplicity bias
The generalization mystery of overparametrized deep nets has motivated efforts to
understand how gradient descent (GD) converges to low-loss solutions that generalize well …
Neural fields as learnable kernels for 3d reconstruction
We present Neural Kernel Fields: a novel method for reconstructing implicit 3D
shapes based on a learned kernel ridge regression. Our technique achieves state-of-the-art …
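For intuition, here is a generic kernel ridge regression fit to scattered point samples of an implicit (signed-distance-like) function in 2D; the RBF kernel, length scale, and synthetic circle data are placeholder assumptions, not the learned kernel of the paper:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Gaussian RBF kernel between point sets a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * length_scale ** 2))

rng = np.random.default_rng(0)

# Points sampled near the unit circle; the target is a signed-distance-like value.
theta = rng.uniform(0, 2 * np.pi, 200)
r = 1.0 + rng.normal(scale=0.05, size=200)
pts = np.c_[r * np.cos(theta), r * np.sin(theta)]
sdf = r - 1.0                              # approximate signed distance to the circle

# Kernel ridge regression: alpha = (K + lam*I)^{-1} y,  f(x) = k(x, pts) @ alpha
lam = 1e-3
K = rbf_kernel(pts, pts)
alpha = np.linalg.solve(K + lam * np.eye(len(pts)), sdf)

# Query the fitted implicit function; its zero level set plays the role of the surface.
queries = np.array([[1.0, 0.0], [0.5, 0.0], [1.5, 0.0]])
print(rbf_kernel(queries, pts) @ alpha)
```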
Implicit regularization towards rank minimization in ReLU networks
We study the conjectured relationship between the implicit regularization in neural networks,
trained with gradient-based methods, and rank minimization of their weight matrices …
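A small sketch of the kind of rank diagnostic such analyses rely on, applied here to a synthetic, noisily rank-2 weight matrix (the matrix and the tolerance are illustrative assumptions, not the paper's setting):

```python
import numpy as np

def numerical_rank(M, tol=1e-2):
    """Number of singular values above tol times the largest singular value."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > tol * s[0])), s

def stable_rank(M):
    """||M||_F^2 / ||M||_2^2, a soft proxy for the rank of M."""
    s = np.linalg.svd(M, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

rng = np.random.default_rng(0)

# A weight matrix that is (noisily) rank-2: the kind of low-rank structure that
# implicit regularization is conjectured to favor in trained networks.
U = rng.normal(size=(64, 2))
V = rng.normal(size=(2, 64))
W = U @ V + 0.01 * rng.normal(size=(64, 64))

k, s = numerical_rank(W)
print("numerical rank:", k)
print("stable rank:", round(stable_rank(W), 2))
print("top singular values:", np.round(s[:4], 2))
```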
Banach space representer theorems for neural networks and ridge splines
We develop a variational framework to understand the properties of the functions learned by
neural networks fit to data. We propose and study a family of continuous-domain linear …