Pixelated butterfly: Simple and efficient sparse training for neural network models

T Dao, B Chen, K Liang, J Yang, Z Song… - arXiv preprint arXiv …, 2021 - arxiv.org
Overparameterized neural networks generalize well but are expensive to train. Ideally, one
would like to reduce their computational cost while retaining their generalization benefits …

Width and depth limits commute in residual networks

S Hayou, G Yang - International Conference on Machine …, 2023 - proceedings.mlr.press
We show that taking the width and depth to infinity in a deep neural network with skip
connections, when branches are scaled by $1/\sqrt{\text{depth}}$, results in the same covariance …
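
The $1/\sqrt{\text{depth}}$ branch scaling mentioned in this abstract is easy to probe numerically. Below is a minimal NumPy sketch (an illustrative setup assumed here, not code or parameters from the paper): it forward-propagates an input through residual blocks $x_{l+1} = x_l + \frac{1}{\sqrt{L}} W_l \phi(x_l)$ with i.i.d. Gaussian weights and compares the output norm with and without the scaling.

    # Minimal sketch (assumed setup, not from the paper): residual forward pass
    # with branches scaled by 1/sqrt(depth) vs. unscaled, i.i.d. Gaussian weights.
    import numpy as np

    def output_norm(depth, width=256, scaled=True, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(width) / np.sqrt(width)          # roughly unit-norm input
        branch_scale = 1.0 / np.sqrt(depth) if scaled else 1.0
        for _ in range(depth):
            W = rng.standard_normal((width, width)) / np.sqrt(width)
            x = x + branch_scale * W @ np.maximum(x, 0.0)        # skip connection + ReLU branch
        return np.linalg.norm(x)

    for L in (10, 100, 1000):
        print(L, output_norm(L, scaled=True), output_norm(L, scaled=False))

With the $1/\sqrt{\text{depth}}$ scaling the output norm stays of order one as the depth grows, consistent with the depth-independent covariance structure the abstract refers to; without the scaling the norm grows geometrically with depth.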

On the infinite-depth limit of finite-width neural networks

S Hayou - Transactions on Machine Learning Research, 2022 - openreview.net
In this paper, we study the infinite-depth limit of finite-width residual neural networks with
random Gaussian weights. With proper scaling, we show that by fixing the width and taking …

Infinitely deep neural networks as diffusion processes

S Peluchetti, S Favaro - International Conference on Artificial …, 2020 - proceedings.mlr.press
When the parameters are independently and identically distributed (initialized), neural
networks exhibit undesirable properties that emerge as the number of layers increases, e.g., a …

Neural spectrum alignment: Empirical study

D Kopitkov, V Indelman - Artificial Neural Networks and Machine Learning …, 2020 - Springer
The expressiveness and generalization of deep models were recently addressed via the
connection between neural networks (NNs) and kernel learning, where first-order dynamics …
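
The NN-kernel connection referenced in this snippet is typically studied through the empirical neural tangent kernel, the Gram matrix of per-example parameter gradients; the "spectrum" in the title is the eigenvalue spectrum of that matrix. A small self-contained NumPy sketch (a toy one-hidden-layer ReLU network and random data assumed purely for illustration, not the paper's experimental setup):

    # Sketch (toy setup): empirical neural tangent kernel K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)>
    # for f(x) = v . relu(W x), and its eigenvalue spectrum.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, m = 20, 5, 100                       # examples, input dim, hidden width
    X = rng.standard_normal((n, d))
    W = rng.standard_normal((m, d)) / np.sqrt(d)
    v = rng.standard_normal(m) / np.sqrt(m)

    def per_example_grad(x):
        pre = W @ x                            # hidden pre-activations
        gate = (pre > 0).astype(float)         # ReLU derivative
        grad_W = np.outer(v * gate, x)         # d f / d W
        grad_v = np.maximum(pre, 0.0)          # d f / d v
        return np.concatenate([grad_W.ravel(), grad_v])

    G = np.stack([per_example_grad(x) for x in X])   # n x (number of parameters)
    K = G @ G.T                                      # empirical NTK Gram matrix
    print(np.linalg.eigvalsh(K)[::-1][:5])           # top of the spectrum

Tracking how this spectrum evolves over the course of training is the kind of empirical measurement the paper's title refers to.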

Commutative Width and Depth Scaling in Deep Neural Networks

S Hayou - arXiv preprint arXiv:2310.01683, 2023 - arxiv.org
This paper is the second in the series Commutative Scaling of Width and Depth (WD), on the
commutativity of infinite-width and infinite-depth limits in deep neural networks. Our aim is to …

Doubly infinite residual neural networks: a diffusion process approach

S Peluchetti, S Favaro - Journal of Machine Learning Research, 2021 - jmlr.org
Modern neural networks featuring a large number of layers (depth) and units per layer
(width) have achieved remarkable performance across many domains. While there exists …

Theory of Deep Learning: Neural Tangent Kernel and Beyond

AU Jacot-Guillarmod - 2022 - infoscience.epfl.ch
In recent years, Deep Neural Networks (DNNs) have managed to succeed at tasks that
previously appeared impossible, such as human-level object recognition, text synthesis …

Doubly infinite residual networks: a diffusion process approach

S Peluchetti, S Favaro - stat, 2020 - researchgate.net
When a neural network's parameters are initialized as i.i.d., the network exhibits undesirable
forward and backward properties as the number of layers increases, e.g., vanishing …
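
The undesirable forward behaviour mentioned here (e.g., vanishing) can be reproduced with a plain i.i.d.-initialized feedforward network. The sketch below is a hypothetical illustration (architecture, width, and weight variances are arbitrary choices, not the paper's construction): it tracks the activation norm of a deep ReLU network for weight variances below, at, and above the critical value.

    # Sketch (assumed toy setup): forward signal in a plain i.i.d.-initialized deep ReLU
    # network, showing vanishing vs. exploding activation norms with depth.
    import numpy as np

    def signal_norm(depth, sigma2, width=256, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(width)
        for _ in range(depth):
            W = rng.standard_normal((width, width)) * np.sqrt(sigma2 / width)
            x = np.maximum(W @ x, 0.0)         # fully connected layer + ReLU
        return np.linalg.norm(x)

    for sigma2 in (1.0, 2.0, 4.0):             # below / at / above the critical variance 2
        print(sigma2, ["%.2e" % signal_norm(L, sigma2) for L in (5, 20, 50)])

Below the critical weight variance the signal decays geometrically with depth (vanishing), and above it the signal grows geometrically (exploding); the scaled residual parameterizations studied in these papers aim to avoid exactly this degenerate depth behaviour.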

Wide Neural Networks are Interpolating Kernel Methods: Impact of Initialization on Generalization

M Nonnenmacher, D Reeb, I Steinwart - openreview.net
The recently developed link between strongly overparametrized neural networks (NNs) and
kernel methods has opened a new way to understand puzzling features of NNs, such as …