Activation functions in deep learning: A comprehensive survey and benchmark

SR Dubey, SK Singh, BB Chaudhuri - Neurocomputing, 2022 - Elsevier
Neural networks have grown tremendously in recent years as a tool for solving numerous
problems. Various types of neural networks have been introduced to deal with different types …

A review of activation function for artificial neural network

AD Rasamoelina, F Adjailia… - 2020 IEEE 18th World …, 2020 - ieeexplore.ieee.org
Activation functions play an important role in the training and performance of an
Artificial Neural Network. They provide the necessary non-linear properties to any Artificial …
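
The non-linearity mentioned in this snippet can be illustrated with a minimal sketch; the functions chosen below (ReLU, sigmoid, tanh) are a common selection of my own, not taken from the cited review.

```python
import torch

# Common activation functions applied element-wise to a pre-activation vector.
# Without such non-linearities, a stack of linear layers collapses into a
# single linear map, which is the property the snippet above refers to.
z = torch.linspace(-3.0, 3.0, steps=7)

relu = torch.relu(z)        # max(0, z)
sigmoid = torch.sigmoid(z)  # 1 / (1 + exp(-z))
tanh = torch.tanh(z)        # (exp(z) - exp(-z)) / (exp(z) + exp(-z))

for name, out in [("relu", relu), ("sigmoid", sigmoid), ("tanh", tanh)]:
    print(name, out.tolist())
```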

Understanding self-supervised learning dynamics without contrastive pairs

Y Tian, X Chen, S Ganguli - International Conference on …, 2021 - proceedings.mlr.press
While contrastive approaches to self-supervised learning (SSL) learn representations by
minimizing the distance between two augmented views of the same data point (positive …
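
As a rough illustration of the setup this paper analyzes, the sketch below computes a negative cosine similarity between two augmented views passed through an encoder and a predictor, with a stop-gradient on the target branch (a BYOL/SimSiam-style non-contrastive objective). The network sizes and module names are placeholders of my own, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal non-contrastive SSL objective, sketched with a toy MLP encoder.
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
predictor = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))

def loss_fn(view1, view2):
    z1, z2 = encoder(view1), encoder(view2)
    p1, p2 = predictor(z1), predictor(z2)
    # Stop-gradient on the target branch; minimize distance between the views.
    sim = F.cosine_similarity(p1, z2.detach(), dim=-1) \
        + F.cosine_similarity(p2, z1.detach(), dim=-1)
    return -0.5 * sim.mean()

x = torch.randn(8, 32)
view1, view2 = x + 0.1 * torch.randn_like(x), x + 0.1 * torch.randn_like(x)
print(loss_fn(view1, view2).item())
```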

Dive into deep learning

A Zhang, ZC Lipton, M Li, AJ Smola - arXiv preprint arXiv:2106.11342, 2021 - arxiv.org
This open-source book represents our attempt to make deep learning approachable,
teaching readers the concepts, the context, and the code. The entire book is drafted in …

Finite versus infinite neural networks: an empirical study

J Lee, S Schoenholz, J Pennington… - Advances in …, 2020 - proceedings.neurips.cc
We perform a careful, thorough, and large-scale empirical study of the correspondence
between wide neural networks and kernel methods. By doing so, we resolve a variety of …
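
One concrete instance of the wide-network/kernel correspondence is the closed-form NNGP kernel of an infinitely wide, single-hidden-layer ReLU network. The sketch below uses the standard arc-cosine form and plugs it into kernel ridge regression; it is my own illustration under the assumption of unit-variance Gaussian weights, not code from the paper.

```python
import numpy as np

def relu_nngp_kernel(X, Y):
    """NNGP kernel of an infinitely wide one-hidden-layer ReLU network
    (arc-cosine kernel of degree 1), assuming unit-variance Gaussian weights."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)   # (n, 1)
    ny = np.linalg.norm(Y, axis=1, keepdims=True)   # (m, 1)
    cos = np.clip((X @ Y.T) / (nx * ny.T), -1.0, 1.0)
    theta = np.arccos(cos)
    return (nx * ny.T) / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

# Kernel ridge regression with this kernel mimics the infinitely wide network.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(20, 5)), rng.normal(size=(20, 1))
X_test = rng.normal(size=(3, 5))
K = relu_nngp_kernel(X_train, X_train) + 1e-3 * np.eye(20)  # small ridge term
preds = relu_nngp_kernel(X_test, X_train) @ np.linalg.solve(K, y_train)
print(preds.ravel())
```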

Rezero is all you need: Fast convergence at large depth

T Bachlechner, BP Majumder, H Mao… - Uncertainty in …, 2021 - proceedings.mlr.press
Deep networks often suffer from vanishing or exploding gradients due to inefficient signal
propagation, leading to long training times or convergence difficulties. Various architecture …
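
The remedy proposed in this paper is simple enough to sketch: each residual branch is scaled by a learnable scalar initialized to zero, so every layer starts as the identity and signal propagates cleanly through arbitrary depth. The block below is a minimal PyTorch sketch of that idea; the inner branch is a placeholder of my own.

```python
import torch
import torch.nn as nn

class ReZeroBlock(nn.Module):
    """Residual block with a learnable scalar gate, x + alpha * F(x),
    where alpha starts at zero so the block is the identity at initialization."""
    def __init__(self, dim):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.alpha = nn.Parameter(torch.zeros(1))  # the ReZero scalar

    def forward(self, x):
        return x + self.alpha * self.branch(x)

x = torch.randn(4, 16)
block = ReZeroBlock(16)
print(torch.allclose(block(x), x))  # True at initialization: identity mapping
```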

Understanding the difficulty of training transformers

L Liu, X Liu, J Gao, W Chen, J Han - arXiv preprint arXiv:2004.08249, 2020 - arxiv.org
Transformers have proved effective in many NLP tasks. However, their training requires non-
trivial effort in designing cutting-edge optimizers and learning rate schedulers …
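
One of the scheduler choices alluded to here is the warmup-then-decay learning-rate schedule commonly used for Transformers. The sketch below implements the familiar inverse-square-root variant with linear warmup as an illustration; the d_model and warmup_steps values are arbitrary, and this is the generic schedule rather than the cited paper's own method.

```python
import torch

def transformer_lr(step, d_model=512, warmup_steps=4000):
    # Linear warmup for warmup_steps, then decay proportional to step**-0.5.
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

model = torch.nn.Linear(512, 512)
opt = torch.optim.Adam(model.parameters(), lr=1.0)  # base lr is scaled by the lambda
sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda=transformer_lr)

for step in range(5):
    opt.step()
    sched.step()
    print(step, sched.get_last_lr()[0])
```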

Fixup initialization: Residual learning without normalization

H Zhang, YN Dauphin, T Ma - arXiv preprint arXiv:1901.09321, 2019 - arxiv.org
Normalization layers are a staple in state-of-the-art deep neural network architectures. They
are widely believed to stabilize training, enable higher learning rates, accelerate …
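
Fixup's recipe, roughly, is to drop the normalization layers and instead rescale the residual-branch initialization with depth, zero-initializing the last layer of each branch so every block starts near the identity. The block below is a rough sketch of that idea for a two-layer MLP branch; the scaling follows the paper's L^(-1/(2m-2)) rule with m = 2, but the architecture and sizes are placeholders of my own.

```python
import math
import torch
import torch.nn as nn

class FixupBlock(nn.Module):
    """Normalization-free residual block with Fixup-style initialization:
    scale the first layer by num_blocks**(-1/(2m-2)) for m = 2 layers per
    branch, and zero-initialize the last layer so the block starts as identity."""
    def __init__(self, dim, num_blocks):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)
        self.relu = nn.ReLU()
        # Fixup-style rescaling of the branch initialization.
        nn.init.normal_(self.fc1.weight, std=math.sqrt(2.0 / dim) * num_blocks ** -0.5)
        nn.init.zeros_(self.fc2.weight)
        nn.init.zeros_(self.fc2.bias)

    def forward(self, x):
        return x + self.fc2(self.relu(self.fc1(x)))

x = torch.randn(4, 32)
net = nn.Sequential(*[FixupBlock(32, num_blocks=8) for _ in range(8)])
print(net(x).std().item())  # activations remain well-scaled without normalization
```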

Sorting out Lipschitz function approximation

C Anil, J Lucas, R Grosse - International Conference on …, 2019 - proceedings.mlr.press
Training neural networks under a strict Lipschitz constraint is useful for provable adversarial
robustness, generalization bounds, interpretable gradients, and Wasserstein distance …
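
A key ingredient in this paper is a gradient-norm-preserving activation for Lipschitz-constrained networks. The sketch below implements a GroupSort-style activation (sorting within groups of the feature dimension, with group size 2 giving the MaxMin variant); it is my reading of the idea rather than the authors' code.

```python
import torch

def group_sort(x, group_size=2):
    """GroupSort activation: split the last dimension into groups and sort
    within each group. With group_size=2 this is the MaxMin activation.
    Being a per-row permutation of coordinates, it preserves the L2 norm,
    unlike ReLU, which discards the negative part of the signal."""
    batch, features = x.shape
    assert features % group_size == 0
    grouped = x.reshape(batch, features // group_size, group_size)
    return grouped.sort(dim=-1).values.reshape(batch, features)

x = torch.randn(4, 8)
y = group_sort(x)
print(torch.allclose(x.norm(dim=1), y.norm(dim=1)))  # True: norms are preserved
```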

Statistical mechanics of deep learning

Y Bahri, J Kadmon, J Pennington… - Annual Review of …, 2020 - annualreviews.org
The recent striking success of deep neural networks in machine learning raises profound
questions about the theoretical principles underlying their success. For example, what can …