Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks

S Arora, S Du, W Hu, Z Li… - … Conference on Machine …, 2019 - proceedings.mlr.press
Recent works have cast some light on the mystery of why deep nets fit any data and
generalize despite being very overparametrized. This paper analyzes training and …
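The analysis centres on the Gram matrix of the neural tangent kernel for two-layer ReLU networks. As a sketch of the key quantities (assuming unit-norm inputs; stated from memory of the paper, not quoted):

$$
H^{\infty}_{ij} \;=\; \mathbb{E}_{w \sim \mathcal{N}(0, I)}\!\left[\, x_i^{\top} x_j \,\mathbf{1}\{w^{\top} x_i \ge 0,\ w^{\top} x_j \ge 0\}\,\right] \;=\; \frac{x_i^{\top} x_j \,\bigl(\pi - \arccos(x_i^{\top} x_j)\bigr)}{2\pi},
$$

and the population risk of the learned network is bounded, up to lower-order terms, by

$$
\sqrt{\frac{2\, y^{\top} (H^{\infty})^{-1} y}{n}},
$$

so labels $y$ that align with the top eigendirections of $H^{\infty}$ are provably easy to fit and generalize from.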

Gradient descent provably optimizes over-parameterized neural networks

SS Du, X Zhai, B Poczos, A Singh - arXiv preprint arXiv:1810.02054, 2018 - arxiv.org
One of the mysteries in the success of neural networks is that randomly initialized first-order
methods like gradient descent can achieve zero training loss even though the objective …
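The headline result, sketched here in its gradient-flow form under the paper's assumptions (two-layer ReLU network, polynomially large width, squared loss over n training points):

$$
\|u(t) - y\|_2^2 \;\le\; e^{-\lambda_0 t}\, \|u(0) - y\|_2^2, \qquad \lambda_0 := \lambda_{\min}(H^{\infty}) > 0,
$$

where $u(t)$ collects the network's predictions on the training inputs and $H^{\infty}$ is the Gram matrix defined above; positivity of $\lambda_0$ holds whenever no two training inputs are parallel.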

Fast neural kernel embeddings for general activations

I Han, A Zandieh, J Lee, R Novak… - Advances in neural …, 2022 - proceedings.neurips.cc
The infinite-width limit has shed light on generalization and optimization aspects of deep learning
by establishing connections between neural networks and kernel methods. Despite their …
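For ReLU specifically, the layer kernel needs no approximation: it has a closed form (the degree-1 arc-cosine kernel). The sketch below states that formula and verifies it by Monte Carlo; it illustrates the kind of kernel such embeddings approximate for general activations, not the paper's sketching algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_nngp(x, z):
    """E_{w~N(0,I)}[relu(w.x) relu(w.z)] in closed form
    (degree-1 arc-cosine kernel, up to normalisation)."""
    nx, nz = np.linalg.norm(x), np.linalg.norm(z)
    theta = np.arccos(np.clip(x @ z / (nx * nz), -1.0, 1.0))
    return nx * nz * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

x, z = np.array([1.0, 0.0]), np.array([0.6, 0.8])
w = rng.standard_normal((1_000_000, 2))
mc = np.mean(np.maximum(w @ x, 0) * np.maximum(w @ z, 0))  # Monte Carlo check
print(relu_nngp(x, z), mc)  # the two values agree to ~3 decimals
```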

[PDF][PDF] Uncertainty in neural networks: Bayesian ensembling

T Pearce, M Zaki, A Brintrup, N Anastassacos, A Neely - stat, 2018 - researchgate.net
Understanding the uncertainty of a neural network's (NN) predictions is essential for many
applications. The Bayesian framework provides a principled approach to this; however …
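The mechanism, as I understand it (anchored ensembling, a form of randomised MAP sampling): each ensemble member is regularised toward its own draw from the prior rather than toward zero, so the spread of the ensemble approximates posterior uncertainty. A minimal sketch with a linear toy model, where each anchored MAP solution is available in closed form; the model and hyperparameters are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_p, sigma_n = 1.0, 0.1            # prior and noise std (assumed values)

X = rng.standard_normal((50, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + sigma_n * rng.standard_normal(50)

ensemble = []
for _ in range(5):
    anchor = sigma_p * rng.standard_normal(3)   # each member's own prior draw
    # MAP of ||y - X theta||^2 / sigma_n^2 + ||theta - anchor||^2 / sigma_p^2
    theta = np.linalg.solve(X.T @ X / sigma_n**2 + np.eye(3) / sigma_p**2,
                            X.T @ y / sigma_n**2 + anchor / sigma_p**2)
    ensemble.append(theta)

preds = np.stack([X[:5] @ t for t in ensemble])
print(preds.mean(axis=0))               # ensemble mean ~ posterior mean
print(preds.std(axis=0))                # spread ~ predictive uncertainty
```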

A mathematical theory of relational generalization in transitive inference

S Lippl, K Kay, G Jensen, VP Ferrera… - Proceedings of the …, 2024 - pnas.org
Humans and animals routinely infer relations between different items or events and
generalize these relations to novel combinations of items. This allows them to respond …

Expressive priors in Bayesian neural networks: Kernel combinations and periodic functions

T Pearce, R Tsuchida, M Zaki… - Uncertainty in …, 2020 - proceedings.mlr.press
A simple, flexible approach to creating expressive priors in Gaussian process (GP) models
makes new kernels from a combination of basic kernels, e.g. summing a periodic and linear …
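Kernel summation is standard GP practice (any sum of valid kernels is a valid kernel); the paper's contribution is mirroring such combinations in BNN priors. A short sketch drawing from a periodic-plus-linear GP prior, using the textbook kernel forms:

```python
import numpy as np

def k_linear(x, z):
    return x * z

def k_periodic(x, z, period=1.0, ell=1.0):
    return np.exp(-2 * np.sin(np.pi * np.abs(x - z) / period) ** 2 / ell**2)

def k_sum(x, z):
    return k_linear(x, z) + k_periodic(x, z)   # sum of kernels is a kernel

xs = np.linspace(0, 4, 200)
K = k_sum(xs[:, None], xs[None, :]) + 1e-8 * np.eye(200)  # jitter for stability
sample = np.linalg.cholesky(K) @ np.random.default_rng(0).standard_normal(200)
# `sample` is a prior draw: periodic wiggles superimposed on a linear trend.
```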

Periodic activation functions induce stationarity

L Meronen, M Trapp, A Solin - Advances in Neural …, 2021 - proceedings.neurips.cc
Neural network models are known to reinforce hidden data biases, making them unreliable
and difficult to interpret. We seek to build models that 'know what they do not know' by …
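The classical route to this claim is random Fourier features (Rahimi and Recht): a single hidden layer of cosines with Gaussian weights has a kernel that depends only on x − z, i.e. is stationary. The sketch below demonstrates that mechanism; it is not the paper's exact construction, and the hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, sigma = 2, 50_000, 1.0
W = sigma * rng.standard_normal((m, d))      # Gaussian input weights
b = rng.uniform(0, 2 * np.pi, size=m)        # uniform phases

def features(x):
    # one hidden layer of periodic (cosine) units
    return np.sqrt(2.0 / m) * np.cos(W @ x + b)

x, z = np.array([0.3, -0.7]), np.array([1.1, 0.2])
empirical = features(x) @ features(z)
analytic = np.exp(-sigma**2 * np.linalg.norm(x - z) ** 2 / 2)  # RBF kernel
print(empirical, analytic)   # agree as the width m grows: a stationary kernel
```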

A connection between probability, physics and neural networks

S Ranftl - Physical Sciences Forum, 2022 - mdpi.com
I illustrate an approach for constructing neural networks that obey physical laws a priori.
We start with a simple single-layer neural network (NN) but refrain from …

Differential training: A generic framework to reduce label noises for android malware detection

J Xu, Y Li, RH Deng - 2021 - ink.library.smu.edu.sg
A common problem in machine learning-based malware detection is that training data may
contain noisy labels and it is challenging to make the training data noise-free at a large …

Squared neural families: a new class of tractable density models

R Tsuchida, CS Ong… - Advances in neural …, 2024 - proceedings.neurips.cc
Flexible models for probability distributions are an essential ingredient in many machine
learning tasks. We develop and investigate a new class of probability distributions, which we …
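As I understand the model class, the density is proportional to the squared norm of a network's output times a base measure; the paper's point is that the normalising constant is available in closed form for suitable activation/base-measure pairs. The 1-D toy below normalises numerically purely to illustrate the shape of the model, standing in for that closed form:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

rng = np.random.default_rng(0)
W, b = rng.standard_normal(8), rng.standard_normal(8)
V = rng.standard_normal((3, 8))

def unnorm(x):
    # p(x) proportional to ||V f(W x + b)||^2 * mu(x), Gaussian base measure mu
    h = np.cos(W * x + b)             # illustrative activation choice
    return np.sum((V @ h) ** 2) * norm.pdf(x)

Z, _ = quad(unnorm, -10, 10)          # numeric stand-in for the closed form
print(unnorm(0.3) / Z)                # normalised density at x = 0.3
```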