Recent advances in stochastic gradient descent in deep learning

Y Tian, Y Zhang, H Zhang - Mathematics, 2023 - mdpi.com
In the age of artificial intelligence, finding the best approach to handling huge amounts of data is a
tremendously motivating and hard problem. Among machine learning models, stochastic …
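
The survey above covers stochastic gradient methods; as a point of reference, here is a minimal NumPy sketch of the vanilla mini-batch SGD update on a toy least-squares objective. The objective, learning rate, and batch size are illustrative choices, not taken from the paper.

```python
import numpy as np

def sgd_least_squares(X, y, lr=0.01, batch_size=32, epochs=20, seed=0):
    """Plain mini-batch SGD on a least-squares objective (illustrative toy problem)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            # Gradient of 0.5 * ||X_b w - y_b||^2 / |b| with respect to w
            residual = X[batch] @ w - y[batch]
            grad = X[batch].T @ residual / len(batch)
            w -= lr * grad
    return w

# Toy usage: recover a random linear model from noisy observations
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=1000)
print(np.linalg.norm(sgd_least_squares(X, y) - w_true))
```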

Neural collapse: A review on modelling principles and generalization

V Kothapalli - arXiv preprint arXiv:2206.04041, 2022 - arxiv.org
Deep classifier neural networks enter the terminal phase of training (TPT) when training
error reaches zero and tend to exhibit intriguing Neural Collapse (NC) properties. Neural …
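
A hedged sketch of one way to probe the within-class variability collapse (the NC1 property this review covers) from penultimate-layer features. The simplified ratio-of-traces metric below is an illustrative proxy, not the exact statistic used in the literature.

```python
import numpy as np

def within_class_collapse(features, labels):
    """Ratio of within-class to between-class covariance traces.
    Values near zero indicate NC1-style variability collapse."""
    global_mean = features.mean(axis=0)
    d = features.shape[1]
    sw = np.zeros((d, d))   # within-class scatter
    sb = np.zeros((d, d))   # between-class scatter
    for c in np.unique(labels):
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        diff = fc - mu_c
        sw += diff.T @ diff / len(features)
        centered = (mu_c - global_mean)[:, None]
        sb += (len(fc) / len(features)) * (centered @ centered.T)
    return np.trace(sw) / np.trace(sb)
```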

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
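
A small sketch of global magnitude pruning, one of the simplest schemes this survey covers; the layer dictionary and sparsity level are made up for illustration.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude entries across all layers (global magnitude pruning)."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    threshold = np.quantile(all_mags, sparsity)   # keep only the top (1 - sparsity) fraction
    return {name: np.where(np.abs(w) >= threshold, w, 0.0) for name, w in weights.items()}

# Illustrative usage on random "layers"
rng = np.random.default_rng(0)
weights = {"fc1": rng.normal(size=(64, 32)), "fc2": rng.normal(size=(32, 10))}
pruned = magnitude_prune(weights, sparsity=0.8)
print({name: float((w == 0).mean()) for name, w in pruned.items()})
```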

AutoFormer: Searching transformers for visual recognition

M Chen, H Peng, J Fu, H Ling - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Recently, pure transformer-based models have shown great potential for vision tasks such
as image classification and detection. However, the design of transformer networks is …

FedBN: Federated learning on non-iid features via local batch normalization

X Li, M Jiang, X Zhang, M Kamp, Q Dou - arXiv preprint arXiv:2102.07623, 2021 - arxiv.org
The emerging paradigm of federated learning (FL) strives to enable collaborative training of
deep models on the network edge without centrally aggregating raw data and hence …
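
A hedged sketch of the aggregation rule suggested by the FedBN title: server-side averaging skips batch-normalization parameters, so each client keeps its own BN statistics. The parameter-name convention (keys containing "bn") and uniform client weighting are assumptions made for illustration.

```python
import numpy as np

def fedbn_aggregate(client_models, is_bn_param=lambda name: "bn" in name):
    """Average non-BN parameters across clients; leave BN parameters local (FedBN-style).
    client_models: list of dicts mapping parameter name -> np.ndarray."""
    names = client_models[0].keys()
    averaged = {
        name: np.mean([m[name] for m in client_models], axis=0)
        for name in names if not is_bn_param(name)
    }
    # Each client takes the shared averaged weights but keeps its own BN parameters.
    return [
        {name: averaged.get(name, m[name]) for name in names}
        for m in client_models
    ]
```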

Hidden progress in deep learning: SGD learns parities near the computational limit

B Barak, B Edelman, S Goel… - Advances in …, 2022 - proceedings.neurips.cc
There is mounting evidence of emergent phenomena in the capabilities of deep learning
methods as we scale up datasets, model sizes, and training times. While there are some …
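
A short sketch of the sparse parity learning task named in the title: the label is the product of k hidden ±1 coordinates. The specific n, k, and sample count below are arbitrary illustrative choices.

```python
import numpy as np

def sparse_parity_dataset(n_samples, n=30, k=3, seed=0):
    """Generate the (n, k)-parity problem: y = product of x_i over a hidden subset S of size k."""
    rng = np.random.default_rng(seed)
    support = rng.choice(n, size=k, replace=False)      # the hidden coordinates S
    X = rng.choice([-1.0, 1.0], size=(n_samples, n))    # uniform +/-1 inputs
    y = np.prod(X[:, support], axis=1)                  # parity label in {-1, +1}
    return X, y, support

X, y, support = sparse_parity_dataset(10000)
print("hidden support:", sorted(support), "label balance:", y.mean())
```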

Towards understanding ensemble, knowledge distillation and self-distillation in deep learning

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2012.09816, 2020 - arxiv.org
We formally study how ensembles of deep learning models can improve test accuracy, and
how the superior performance of an ensemble can be distilled into a single model using …
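
For context, here is a hedged sketch of distilling an ensemble into a single student with temperature-scaled soft targets: the standard knowledge-distillation loss, not necessarily the exact formulation analyzed in this paper. The temperature and mixing weight are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits_list, labels, T=4.0, alpha=0.5):
    """Cross-entropy on hard labels plus KL to the ensemble's averaged softened predictions."""
    teacher_probs = np.mean([softmax(t / T) for t in teacher_logits_list], axis=0)
    student_soft = softmax(student_logits / T)
    student_hard = softmax(student_logits)
    n = len(labels)
    ce = -np.log(student_hard[np.arange(n), labels] + 1e-12).mean()
    kl = np.sum(teacher_probs * (np.log(teacher_probs + 1e-12) - np.log(student_soft + 1e-12)),
                axis=1).mean()
    return alpha * ce + (1 - alpha) * (T ** 2) * kl
```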

Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation

M Belkin - Acta Numerica, 2021 - cambridge.org
In the past decade the mathematical theory of machine learning has lagged far behind the
triumphs of deep neural networks on practical challenges. However, the gap between theory …
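
One concrete instance of the interpolation regime this article surveys: with more parameters than samples, the minimum-norm least-squares solution fits noisy training data exactly, and its test behavior is what the interpolation literature analyzes. The dimensions and sparse ground truth below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200                      # more parameters than samples: interpolation regime
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0
y = X @ w_true + 0.1 * rng.normal(size=n)

w_mn = np.linalg.pinv(X) @ y        # minimum-norm interpolating solution
print("train residual:", np.linalg.norm(X @ w_mn - y))       # ~0: exact fit of noisy data

X_test = rng.normal(size=(500, d))
y_test = X_test @ w_true
print("test RMSE:", np.sqrt(np.mean((X_test @ w_mn - y_test) ** 2)))
```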

Gradient starvation: A learning proclivity in neural networks

M Pezeshki, O Kaba, Y Bengio… - Advances in …, 2021 - proceedings.neurips.cc
We identify and formalize a fundamental gradient descent phenomenon resulting in a
learning proclivity in over-parameterized neural networks. Gradient Starvation arises when …
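
As I recall, the remedy associated with this gradient-starvation analysis is Spectral Decoupling, an L2 penalty on the logits added to the usual cross-entropy; treat the sketch below as an illustrative recollection rather than the paper's exact formulation, with the logistic setup and λ chosen arbitrarily.

```python
import numpy as np

def spectral_decoupling_loss(logits, labels, lam=0.1):
    """Binary cross-entropy with an L2 penalty on the logits (Spectral Decoupling-style).
    Penalizing the logits, rather than the weights, is the idea; lam is illustrative."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    ce = -(labels * np.log(probs + 1e-12) + (1 - labels) * np.log(1 - probs + 1e-12)).mean()
    return ce + 0.5 * lam * np.mean(logits ** 2)
```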

Theory of overparametrization in quantum neural networks

M Larocca, N Ju, D García-Martín, PJ Coles… - Nature Computational …, 2023 - nature.com
The prospect of achieving quantum advantage with quantum neural networks (QNNs) is
exciting. Understanding how QNN properties (for example, the number of parameters M) …