Quantifying the impact of label noise on federated learning

S Ke, C Huang, X Liu - arXiv preprint arXiv:2211.07816, 2022 - arxiv.org
Federated Learning (FL) is a distributed machine learning paradigm where clients
collaboratively train a model using their local (human-generated) datasets. While existing …
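For reference, the standard FedAvg-style aggregation step that such FL analyses typically build on (not necessarily the exact protocol studied in this paper): with $K$ clients holding $n_k$ local samples and $n = \sum_k n_k$, the server combines locally trained weights as
$$ w_{t+1} \;=\; \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k}, $$
where $w_{t+1}^{k}$ is client $k$'s model after local training on its (possibly noisily labelled) data.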

The representation theory of neural networks

M Armenta, PM Jodoin - Mathematics, 2021 - mdpi.com
In this work, we show that neural networks can be represented via the mathematical theory
of quiver representations. More specifically, we prove that a neural network is a quiver …
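As background (standard quiver-representation terminology, not the paper's specific construction): a quiver $Q$ is a directed graph with vertex set $Q_0$ and arrow set $Q_1$, and a representation of $Q$ assigns a vector space $V_i$ to each vertex $i \in Q_0$ and a linear map $f_a : V_{s(a)} \to V_{t(a)}$ to each arrow $a \in Q_1$; the weights of a feedforward network can be read as such linear maps on the vertices and edges of its computational graph.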

$\mathcal{G}$-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space

Q Meng, S Zheng, H Zhang, W Chen, ZM Ma… - arXiv preprint arXiv …, 2018 - arxiv.org
It is well known that neural networks with rectified linear unit (ReLU) activation functions are
positively scale-invariant. Conventional algorithms like stochastic gradient descent optimize …
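The scale-invariance in question is the standard rescaling symmetry of ReLU units: since $\mathrm{ReLU}(\lambda x) = \lambda\,\mathrm{ReLU}(x)$ for $\lambda > 0$, multiplying a hidden neuron's incoming weights by $c > 0$ and its outgoing weights by $1/c$ leaves the network function unchanged,
$$ f(x;\, c\,w_{\mathrm{in}},\, w_{\mathrm{out}}/c) \;=\; f(x;\, w_{\mathrm{in}},\, w_{\mathrm{out}}), \qquad c > 0. $$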

A path-norm toolkit for modern networks: consequences, promises and challenges

A Gonon, N Brisebarre, E Riccietti… - arXiv preprint arXiv …, 2023 - arxiv.org
This work introduces the first toolkit around path-norms that is fully able to encompass
general DAG ReLU networks with biases, skip connections and any operation based on the …
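For context, the basic $\ell_1$ path-norm (stated here in the bias-free feedforward case; the paper's DAG version with biases and skip connections generalizes this) sums, over all input-output paths $p$, the product of the absolute weights along the path:
$$ \Phi(\theta) \;=\; \sum_{p} \prod_{e \in p} |w_e|. $$
Unlike individual layer norms, this quantity is invariant under the ReLU rescalings noted above.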

Positively scale-invariant flatness of ReLU neural networks

M Yi, Q Meng, W Chen, Z Ma, TY Liu - arXiv preprint arXiv:1903.02237, 2019 - arxiv.org
It was empirically confirmed by Keskar et al. that flatter minima
generalize better. However, for the popular ReLU network, a sharp minimum can also …
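The issue hinted at here is the standard one: the ReLU rescaling above changes Hessian-based sharpness without changing the function. If $T_c\theta$ denotes the rescaled parameters, then $f(x;\, T_c\theta) = f(x;\,\theta)$ and hence $L(T_c\theta) = L(\theta)$, yet $\nabla^2_\theta L(T_c\theta) \neq \nabla^2_\theta L(\theta)$ in general, so a flat minimum can be mapped to an arbitrarily sharp one; a positively scale-invariant flatness measure is meant to remove this ambiguity.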

A priori estimates of the population risk for residual networks

C Ma, Q Wang - arXiv preprint arXiv:1903.02154, 2019 - arxiv.org
Optimal a priori estimates are derived for the population risk, also known as the
generalization error, of a regularized residual network model. An important part of the …
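Here the population risk is the usual expected loss over the data distribution, as opposed to the empirical risk on the $n$ training samples (notation assumed, not taken from the paper):
$$ R(\theta) = \mathbb{E}_{(x,y)\sim \mathcal{D}}\big[\ell(f(x;\theta), y)\big], \qquad \hat{R}_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} \ell(f(x_i;\theta), y_i); $$
an a priori estimate bounds $R$ in terms of norms of the target function rather than of the trained parameters.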

ReLU soothes the NTK condition number and accelerates optimization for wide neural networks

C Liu, L Hui - arXiv preprint arXiv:2305.08813, 2023 - arxiv.org
Rectified linear unit (ReLU), as a non-linear activation function, is well known to improve the
expressivity of neural networks such that any continuous function can be approximated to …
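For reference, the neural tangent kernel and its condition number as typically used in such analyses (notation assumed here, not taken from the paper): on training inputs $x_1,\dots,x_n$,
$$ K_{ij} \;=\; \big\langle \nabla_\theta f(x_i;\theta),\, \nabla_\theta f(x_j;\theta) \big\rangle, \qquad \kappa(K) \;=\; \frac{\lambda_{\max}(K)}{\lambda_{\min}(K)}, $$
and a smaller $\kappa(K)$ translates into a faster worst-case convergence rate for gradient descent in the wide-network (linearized) regime.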