Towards a mathematical understanding of neural network-based machine learning: what we know and what we don't
The purpose of this article is to review the achievements made in the last few years towards
the understanding of the reasons behind the success and subtleties of neural network …
The Barron space and the flow-induced function spaces for neural network models
One of the key issues in the analysis of machine learning models is to identify the
appropriate function space and norm for the model. This is the set of functions endowed with …
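For orientation, the Barron norm for shallow ReLU networks is usually defined as follows (a standard definition from this literature, restated here rather than quoted from the paper): for functions admitting the integral representation

\[ f(x) = \int a\,\sigma(w^\top x + b)\, d\mu(a, w, b), \qquad \sigma(t) = \max(t, 0), \]

one sets

\[ \|f\|_{\mathcal{B}} = \inf_{\mu} \int |a|\,\big( \|w\|_1 + |b| \big)\, d\mu(a, w, b), \]

the infimum running over all probability measures \mu on the parameter space (a, w, b) \in \mathbb{R} \times \mathbb{R}^d \times \mathbb{R} that realise f on the domain of interest; the Barron space is the set of functions with finite norm.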
Characterization of the variation spaces corresponding to shallow neural networks
We study the variation space corresponding to a dictionary of functions in L^2(Ω) for a
bounded domain Ω ⊂ R^d. Specifically, we compare the variation space, which is defined in …
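As a reference point (a standard definition, stated as an assumption about the setting rather than quoted from the paper): for a dictionary \mathbb{D} \subset L^2(\Omega), the variation norm is the gauge of the closed, symmetric convex hull of the dictionary,

\[ \|f\|_{\mathcal{K}(\mathbb{D})} = \inf \big\{ t > 0 : f \in t\, \overline{\mathrm{conv}}\,( \mathbb{D} \cup -\mathbb{D} ) \big\}, \]

and the variation space \mathcal{K}(\mathbb{D}) consists of all f \in L^2(\Omega) with finite norm. For dictionaries of single ReLU neurons with bounded parameters, this construction is closely related to the Barron space sketched above.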
Penalising the biases in norm regularisation enforces sparsity
Controlling the parameters' norm often yields good generalisation when training neural
networks. Beyond simple intuitions, the relation between regularising parameters' norm and …
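To make the object of study concrete: for a shallow network f_\theta(x) = \sum_j a_j\, \sigma(w_j^\top x + b_j), "controlling the parameters' norm" typically means solving a penalised problem of the form below, and whether the biases b_j enter the penalty is precisely the distinction at stake (the display is a generic formulation, not necessarily the paper's exact notation):

\[ \min_\theta \; \frac{1}{n} \sum_{i=1}^n \ell\big(f_\theta(x_i), y_i\big) + \lambda \sum_j |a_j|\, \big( \|w_j\|_2 + |b_j| \big), \]

as opposed to the same objective with the penalty \sum_j |a_j|\, \|w_j\|_2, which leaves the biases unregularised.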
Optimized injection of noise in activation functions to improve generalization of neural networks
This paper proposes a flexible probabilistic activation function that enhances the training
and operation of artificial neural networks by intentionally injecting noise to gain additional …
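A minimal sketch of the generic idea of injecting noise into an activation function, written in plain NumPy; the Gaussian noise model, the fixed noise scale and the train/eval switch are illustrative assumptions and not the probabilistic activation actually proposed in the paper.

import numpy as np

def noisy_relu(x, noise_scale=0.1, training=True, rng=None):
    """ReLU with additive Gaussian noise on the pre-activation.

    During training the perturbation acts as a stochastic regulariser;
    at evaluation time the noise is switched off, so the function
    reduces to a plain ReLU.
    """
    if training:
        rng = np.random.default_rng() if rng is None else rng
        x = x + noise_scale * rng.standard_normal(x.shape)
    return np.maximum(x, 0.0)

# Example: one noisy forward pass through a hidden layer.
rng = np.random.default_rng(0)
W, b = rng.standard_normal((16, 8)), np.zeros(16)
x = rng.standard_normal(8)
h = noisy_relu(W @ x + b, noise_scale=0.05, training=True, rng=rng)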
Transformers learn nonlinear features in context: Nonconvex mean-field dynamics on the attention landscape
Large language models based on the Transformer architecture have demonstrated
impressive capabilities to learn in context. However, existing theoretical studies on how this …
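For readers unfamiliar with the setting, "learning in context" in its simplest, commonly studied form (this framing is background, not a quotation from the paper) means feeding the trained model a prompt of labelled examples together with a query and asking for the query's label without any weight update,

\[ Z = \big( (x_1, y_1), \dots, (x_N, y_N), x_{\mathrm{query}} \big) \;\longmapsto\; \hat{y}_{\mathrm{query}} = T_\theta(Z), \]

and a mean-field analysis studies the training dynamics of T_\theta in the limit of infinite width, where the parameters are described by a probability measure evolving under a gradient flow.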
Minimum norm interpolation by perceptra: Explicit regularization and implicit bias
J Park, I Pelakh, S Wojtowytsch - Advances in Neural …, 2023 - proceedings.neurips.cc
We investigate how shallow ReLU networks interpolate between known regions. Our
analysis shows that empirical risk minimizers converge to a minimum norm interpolant as …
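The limiting object mentioned in the snippet can be written as a constrained problem; here \|\cdot\|_{\mathcal{B}} stands for a Barron/path-type norm for shallow ReLU networks, without reproducing the paper's precise choice:

\[ f^\star \in \arg\min \big\{ \|f\|_{\mathcal{B}} \; : \; f(x_i) = y_i \ \text{ for } i = 1, \dots, n \big\}, \]

i.e. among all functions that fit the known data exactly, the interpolant of smallest norm.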
Generalization error bounds for deep neural networks trained by SGD
Generalization error bounds for deep neural networks trained by stochastic gradient descent
(SGD) are derived by combining a dynamical control of an appropriate parameter norm and …
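For context, norm-based generalisation bounds obtained this way typically have the generic shape below; the specific norm, depth dependence and constants from the paper are not reproduced here:

\[ \mathbb{E}\big[ \ell(f_\theta(x), y) \big] \;\le\; \frac{1}{n} \sum_{i=1}^n \ell\big(f_\theta(x_i), y_i\big) \;+\; C\, \frac{\|\theta\|}{\sqrt{n}} \;+\; c\,\sqrt{\frac{\log(1/\delta)}{n}}, \]

holding with probability at least 1 - \delta over an i.i.d. sample of size n, where \|\theta\| is a suitable parameter norm that the SGD dynamics are shown to keep under control.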
Embeddings between Barron spaces with higher-order activation functions
The approximation properties of infinitely wide shallow neural networks heavily depend on
the choice of the activation function. To understand this influence, we study embeddings …
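To fix notation (a standard convention; the paper's exact definitions may differ): write \mathcal{B}_\sigma for the Barron space built from an activation \sigma, for instance the rectified power units \sigma_k(t) = \max(t, 0)^k, with the norm defined analogously to the ReLU case above. An embedding result is then a statement of the form

\[ \mathcal{B}_{\sigma_k}(\Omega) \hookrightarrow \mathcal{B}_{\sigma_l}(\Omega), \qquad \text{i.e.} \quad \|f\|_{\mathcal{B}_{\sigma_l}} \le C\, \|f\|_{\mathcal{B}_{\sigma_k}} \ \text{ for all } f \in \mathcal{B}_{\sigma_k}(\Omega), \]

identifying which activations yield larger or smaller spaces; which direction holds for which pairs (k, l) is exactly what the paper establishes and is not asserted here.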
Some observations on high-dimensional partial differential equations with Barron data
We use explicit representation formulas to show that solutions to certain partial differential
equations lie in Barron spaces or multilayer spaces if the PDE data lie in such function …
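A representative instance of such a representation formula, used here only to illustrate the mechanism under the assumption of a Fourier-analytic Barron-type norm on \mathbb{R}^d (the equations actually treated in the paper may differ): for the screened Poisson equation the solution is a Fourier multiplier applied to the data,

\[ -\Delta u + u = f \ \text{ on } \mathbb{R}^d \quad \Longrightarrow \quad \hat{u}(\xi) = \frac{\hat{f}(\xi)}{1 + |\xi|^2}, \]

so that \int (1 + |\xi|)\, |\hat{u}(\xi)|\, d\xi \le \int (1 + |\xi|)\, |\hat{f}(\xi)|\, d\xi, and a finite Barron-type norm of the data f is inherited by the solution u.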