Quantifying the impact of label noise on federated learning
Federated Learning (FL) is a distributed machine learning paradigm where clients
collaboratively train a model using their local (human-generated) datasets. While existing …
The representation theory of neural networks
In this work, we show that neural networks can be represented via the mathematical theory
of quiver representations. More specifically, we prove that a neural network is a quiver …
G-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space
It is well known that neural networks with rectified linear units (ReLU) activation functions are
positively scale-invariant. Conventional algorithms like stochastic gradient descent optimize …
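To make the positive scale-invariance concrete: for any c > 0, multiplying the incoming weights of a hidden ReLU unit by c and dividing its outgoing weights by c leaves the network function unchanged, because relu(c·z) = c·relu(z). A minimal numpy sketch of that property (this is only the invariance itself, not the paper's G-SGD algorithm; the toy two-layer net and values are assumptions for illustration):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# Two-layer ReLU net f(x) = W2 @ relu(W1 @ x) (toy sizes, no biases).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # hidden x input
W2 = rng.normal(size=(2, 4))   # output x hidden
x = rng.normal(size=3)

c = 5.0                        # any positive rescaling factor
W1s, W2s = W1.copy(), W2.copy()
W1s[0, :] *= c                 # scale incoming weights of hidden unit 0 ...
W2s[:, 0] /= c                 # ... and inversely scale its outgoing weights

f_orig = W2 @ relu(W1 @ x)
f_resc = W2s @ relu(W1s @ x)
print(np.allclose(f_orig, f_resc))   # True: relu(c*z) = c*relu(z) for c > 0
```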
A path-norm toolkit for modern networks: consequences, promises and challenges
This work introduces the first toolkit around path-norms that is fully able to encompass
general DAG ReLU networks with biases, skip connections and any operation based on the …
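For intuition about what a path-norm is: in the classical special case of a plain layered ReLU network without biases, the L1 path-norm is the sum over all input-output paths of the product of absolute weights along the path, and it can be computed by chaining element-wise absolute weight matrices. A minimal sketch of that special case only (the paper's toolkit targets far more general DAG networks with biases, skip connections and pooling-type operations):

```python
import numpy as np

def l1_path_norm(weights):
    """L1 path-norm of a plain layered ReLU net without biases:
    the sum over all input-output paths of the product of |weight|
    along the path, obtained by chaining |W_L| ... |W_1| and
    summing all entries of the resulting matrix."""
    acc = np.abs(weights[0])
    for W in weights[1:]:
        acc = np.abs(W) @ acc
    return float(acc.sum())

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))   # first layer: 3 inputs -> 5 hidden units
W2 = rng.normal(size=(2, 5))   # second layer: 5 hidden units -> 2 outputs
print(l1_path_norm([W1, W2]))  # sums over the 3 * 5 * 2 input-output paths
```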
Positively scale-invariant flatness of ReLU neural networks
It was empirically confirmed by Keskar et al. \cite{SharpMinima} that flatter minima
generalize better. However, for the popular ReLU network, a sharp minimum can also …
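The rescaling argument behind this observation can be reproduced on a toy example: a positive rescaling of one hidden unit leaves the network function, and hence the minimum, unchanged, yet a fixed-size perturbation of the rescaled outgoing weight now changes the loss far more, so the same minimum looks much sharper in conventional (non-scale-invariant) coordinates. A hedged numpy sketch under assumed toy values (this is the generic rescaling argument, not the paper's PSI-flatness measure):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def loss(W1, W2, x, y):
    return 0.5 * np.sum((W2 @ relu(W1 @ x) - y) ** 2)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)
if W1[0] @ x <= 0:             # keep hidden unit 0 active so the effect is visible
    x = -x
y = W2 @ relu(W1 @ x)          # choose the target so this point has zero loss

c = 100.0                      # same function, very different parameters
W1s, W2s = W1.copy(), W2.copy()
W1s[0, :] *= c
W2s[:, 0] /= c

eps = 1e-2                     # fixed-size perturbation of the outgoing weight
for name, (A, B) in {"original": (W1, W2), "rescaled": (W1s, W2s)}.items():
    Bp = B.copy()
    Bp[:, 0] += eps
    print(f"{name}: loss at minimum = {loss(A, B, x, y):.3e}, "
          f"after perturbing W2[:, 0] = {loss(A, Bp, x, y):.3e}")
```

Both parameterizations sit at a zero-loss minimum, but the rescaled one incurs roughly c² times the loss under the same perturbation, i.e. it is "sharper" without generalizing any differently.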
A priori estimates of the population risk for residual networks
C. Ma, Q. Wang - arXiv preprint arXiv:1903.02154, 2019 - arxiv.org
Optimal a priori estimates are derived for the population risk, also known as the
generalization error, of a regularized residual network model. An important part of the …
ReLU soothes the NTK condition number and accelerates optimization for wide neural networks
Rectified linear unit (ReLU), as a non-linear activation function, is well known to improve the
expressivity of neural networks such that any continuous function can be approximated to …
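One way to probe this kind of claim numerically is to compare the condition number of the empirical (finite-width) NTK Gram matrix K = J Jᵀ of a two-layer network under ReLU versus a linear (identity) activation. The sketch below is only an illustration under assumed settings (width 4096, unit-norm Gaussian inputs, scalar output); the paper's analysis concerns the infinite-width NTK. In this toy setting the linear network's kernel is numerically singular because the number of samples exceeds the input dimension, while the ReLU kernel stays well-conditioned:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def empirical_ntk(X, W, a, act, d_act):
    """Empirical (finite-width) NTK Gram matrix K = J @ J.T for the scalar
    two-layer net f(x) = a @ act(W @ x) / sqrt(m); each row of J is the
    gradient of f(x_i) with respect to all parameters (W and a)."""
    m = W.shape[0]
    rows = []
    for x in X:
        z = W @ x
        grad_a = act(z) / np.sqrt(m)                                 # df/da
        grad_W = (a * d_act(z))[:, None] * x[None, :] / np.sqrt(m)   # df/dW
        rows.append(np.concatenate([grad_W.ravel(), grad_a]))
    J = np.stack(rows)
    return J @ J.T

rng = np.random.default_rng(0)
n, d, m = 20, 10, 4096                          # samples, input dim, width
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
W = rng.normal(size=(m, d))                     # first-layer weights
a = rng.choice([-1.0, 1.0], size=m)             # second-layer signs

K_relu = empirical_ntk(X, W, a, relu, lambda z: (z > 0).astype(float))
K_lin  = empirical_ntk(X, W, a, lambda z: z, np.ones_like)
print("condition number with ReLU:    ", np.linalg.cond(K_relu))
print("condition number with identity:", np.linalg.cond(K_lin))
```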