A survey on efficient convolutional neural networks and hardware acceleration

D Ghimire, D Kil, S Kim - Electronics, 2022 - mdpi.com
Over the past decade, deep-learning-based representations have demonstrated remarkable
performance in academia and industry. The learning capability of convolutional neural …

Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing improvement
in general-purpose processors due to the foreseeable end of Moore's Law …

Spike-driven transformer

M Yao, J Hu, Z Zhou, L Yuan, Y Tian… - Advances in neural …, 2024 - proceedings.neurips.cc
Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option
due to their unique spike-based event-driven (i.e., spike-driven) paradigm. In this paper, we …

Going deeper with image transformers

H Touvron, M Cord, A Sablayrolles… - Proceedings of the …, 2021 - openaccess.thecvf.com
Transformers have been recently adapted for large-scale image classification, achieving
high scores that shake up the long supremacy of convolutional neural networks. However, the …

RepVGG: Making VGG-style convnets great again

X Ding, X Zhang, N Ma, J Han… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present a simple but powerful convolutional neural network architecture, which has a
VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and …

Dive into deep learning

A Zhang, ZC Lipton, M Li, AJ Smola - arXiv preprint arXiv:2106.11342, 2021 - arxiv.org
This open-source book represents our attempt to make deep learning approachable,
teaching readers the concepts, the context, and the code. The entire book is drafted in …

On layer normalization in the transformer architecture

R Xiong, Y Yang, D He, K Zheng… - International …, 2020 - proceedings.mlr.press
The Transformer is widely used in natural language processing tasks. To train a Transformer,
however, one usually needs a carefully designed learning rate warm-up stage, which is …

Picking winning tickets before training by preserving gradient flow

C Wang, G Zhang, R Grosse - arXiv preprint arXiv:2002.07376, 2020 - arxiv.org
Overparameterization has been shown to benefit both the optimization and generalization of
neural networks, but large networks are resource hungry at both training and test time …

Wide neural networks of any depth evolve as linear models under gradient descent

J Lee, L Xiao, S Schoenholz, Y Bahri… - Advances in neural …, 2019 - proceedings.neurips.cc
A longstanding goal in deep learning research has been to precisely characterize training
and generalization. However, the often complex loss landscapes of neural networks have …

Pre-training via denoising for molecular property prediction

S Zaidi, M Schaarschmidt, J Martens, H Kim… - arXiv preprint arXiv …, 2022 - arxiv.org
Many important problems involving molecular property prediction from 3D structures have
limited data, posing a generalization challenge for neural networks. In this paper, we …