An overview and comparative analysis of recurrent neural networks for short term load forecasting

FM Bianchi, E Maiorino, MC Kampffmeyer… - arXiv preprint arXiv …, 2017 - arxiv.org
The key component in forecasting demand and consumption of resources in a supply
network is an accurate prediction of real-valued time series. Indeed, both service …
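For concreteness, here is a minimal sketch of one of the recurrent architectures such comparative studies typically include, an echo state network doing one-step-ahead prediction of a real-valued series. The synthetic series, reservoir size, and hyperparameters are illustrative choices, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "load" series with a daily-like period plus noise (illustrative only).
t = np.arange(2000)
y = np.sin(2 * np.pi * t / 96) + 0.1 * rng.standard_normal(t.size)

n_res, rho, ridge = 300, 0.9, 1e-6
W_in = rng.uniform(-0.5, 0.5, (n_res, 1))
W = rng.standard_normal((n_res, n_res))
W *= rho / np.max(np.abs(np.linalg.eigvals(W)))      # rescale to spectral radius rho

# Drive the fixed random reservoir with the series and collect its states.
states = np.zeros((t.size, n_res))
x = np.zeros(n_res)
for i in range(1, t.size):
    x = np.tanh(W_in @ y[i - 1:i] + W @ x)
    states[i] = x                                    # state has seen y[0..i-1]

# Ridge-regression readout: the state at time i predicts y[i] (one step ahead).
washout, split = 100, 1500
X_tr, y_tr = states[washout:split], y[washout:split]
W_out = np.linalg.solve(X_tr.T @ X_tr + ridge * np.eye(n_res), X_tr.T @ y_tr)

pred = states[split:] @ W_out
print("test MSE:", float(np.mean((pred - y[split:]) ** 2)))
```

Only the linear readout is trained here; the reservoir weights stay fixed, which is what makes this family of models cheap to fit compared with fully trained recurrent networks.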

Gradient descent finds global minima of deep neural networks

S Du, J Lee, H Li, L Wang… - … conference on machine …, 2019 - proceedings.mlr.press
Gradient descent finds a global minimum in training deep neural networks despite the
objective function being non-convex. The current paper proves gradient descent achieves …
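Guarantees in this line of work are usually stated as linear convergence of the training loss once the network is sufficiently overparameterized. A hedged paraphrase of the generic form (exact constants, width requirements, and the precise Gram matrix vary by architecture and paper):

```latex
\|\mathbf{y}-\mathbf{u}(k)\|_2^2
  \;\le\;
  \Bigl(1-\tfrac{\eta\,\lambda_{\min}(\mathbf{K})}{2}\Bigr)^{k}
  \,\|\mathbf{y}-\mathbf{u}(0)\|_2^2
```

where u(k) collects the network's training-set predictions after k gradient steps, η is the step size, and K is the Gram (neural-tangent-kernel-style) matrix at initialization, assumed positive definite.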

Wide neural networks of any depth evolve as linear models under gradient descent

J Lee, L Xiao, S Schoenholz, Y Bahri… - Advances in neural …, 2019 - proceedings.neurips.cc
A longstanding goal in deep learning research has been to precisely characterize training
and generalization. However, the often complex loss landscapes of neural networks have …
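The linearization underlying this result is the first-order Taylor expansion of the network in its parameters around initialization; under squared loss, the training dynamics of this linear model are governed by the empirical neural tangent kernel:

```latex
f^{\mathrm{lin}}_{t}(x) \;=\; f_{0}(x) \;+\; \nabla_{\theta} f_{0}(x)^{\top}\,(\theta_{t}-\theta_{0}),
\qquad
\hat{\Theta}_{0}(x,x') \;=\; \nabla_{\theta} f_{0}(x)^{\top}\,\nabla_{\theta} f_{0}(x').
```

The claim of the paper is that, as width grows, the full nonlinear network's training trajectory stays close to that of this linearized model.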

Deep neural networks as gaussian processes

J Lee, Y Bahri, R Novak, SS Schoenholz… - arXiv preprint arXiv …, 2017 - arxiv.org
It has long been known that a single-layer fully-connected neural network with an iid prior
over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network …
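The extension to depth works through a layer-wise kernel recursion; schematically, for a fully-connected network with nonlinearity φ, weight variance σ_w², and bias variance σ_b²:

```latex
K^{(\ell)}(x,x') \;=\; \sigma_b^{2} \;+\; \sigma_w^{2}\,
\mathbb{E}_{f\sim\mathcal{GP}\left(0,\,K^{(\ell-1)}\right)}
\bigl[\phi\left(f(x)\right)\phi\left(f(x')\right)\bigr],
```

so that in the infinite-width limit the outputs of an L-layer network form a Gaussian process with covariance K^{(L)}.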

[BOOK] The principles of deep learning theory

DA Roberts, S Yaida, B Hanin - 2022 - cambridge.org
This textbook establishes a theoretical framework for understanding deep learning models
of practical relevance. With an approach that borrows from theoretical physics, Roberts and …

Generative learning for nonlinear dynamics

W Gilpin - Nature Reviews Physics, 2024 - nature.com
Modern generative machine learning models are able to create realistic outputs far beyond
their training data, such as photorealistic artwork, accurate protein structures or …

Understanding batch normalization

N Bjorck, CP Gomes, B Selman… - Advances in neural …, 2018 - proceedings.neurips.cc
Batch normalization (BN) is a technique to normalize activations in intermediate layers of
deep neural networks. Its tendency to improve accuracy and speed up training has …
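For reference, the normalization itself is simple; a minimal NumPy sketch of the training-mode forward pass (running statistics and the backward pass are omitted):

```python
import numpy as np

# Batch normalization: training-mode forward pass for a fully-connected layer.
# gamma/beta are the learnable scale and shift; eps avoids division by zero.
def batch_norm(x, gamma, beta, eps=1e-5):
    mu = x.mean(axis=0)                      # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # zero-mean, unit-variance features
    return gamma * x_hat + beta              # learnable rescale and shift

x = 5.0 * np.random.randn(64, 128) + 3.0     # pre-activations with poor scale
out = batch_norm(x, gamma=np.ones(128), beta=np.zeros(128))
print(out.mean(axis=0)[:3], out.std(axis=0)[:3])   # means ~0, stds ~1
```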

Understanding plasticity in neural networks

C Lyle, Z Zheng, E Nikishin, BA Pires… - International …, 2023 - proceedings.mlr.press
Plasticity, the ability of a neural network to quickly change its predictions in response to new
information, is essential for the adaptability and robustness of deep reinforcement learning …
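One common way plasticity is probed (a sketch of the general protocol, not the paper's exact experimental setup) is to keep training a single network on a stream of phases whose targets are re-randomized at every phase and to check whether the loss reachable within each phase degrades over time:

```python
import torch

torch.manual_seed(0)
x = torch.randn(512, 32)                      # fixed inputs; sizes are illustrative

net = torch.nn.Sequential(
    torch.nn.Linear(32, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for phase in range(10):
    y = torch.randn(512, 1)                   # fresh random targets each phase
    for _ in range(500):
        loss = torch.nn.functional.mse_loss(net(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # If plasticity degrades, the loss reachable within a phase creeps upward.
    print(f"phase {phase}: end-of-phase loss {loss.item():.4f}")
```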

The shaped transformer: Attention models in the infinite depth-and-width limit

L Noci, C Li, M Li, B He, T Hofmann… - Advances in …, 2024 - proceedings.neurips.cc
In deep learning theory, the covariance matrix of the representations serves as a proxy to
examine the network's trainability. Motivated by the success of Transformers, we study the …
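The diagnostic referred to here can be illustrated with a toy depth sweep: push a batch of inputs through a stack of random layers and track how the off-diagonal correlations of the representation covariance saturate with depth (rank collapse). The plain ReLU MLP below is an illustrative stand-in, not the attention model studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, depth = 8, 512, 50
X = rng.standard_normal((n, d))

for layer in range(1, depth + 1):
    W = rng.standard_normal((d, d)) * np.sqrt(2.0 / d)   # He-style init
    X = np.maximum(X @ W, 0.0)                           # ReLU layer
    if layer % 10 == 0:
        C = X @ X.T / d                                   # representation covariance
        corr = C / np.sqrt(np.outer(np.diag(C), np.diag(C)))
        off_diag = corr[~np.eye(n, dtype=bool)].mean()
        print(f"layer {layer:2d}: mean off-diagonal correlation {off_diag:.3f}")
```

As depth grows, the off-diagonal correlations drift toward 1, i.e. different inputs become indistinguishable to the network; shaping the architecture is aimed at keeping this covariance well conditioned at large depth.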

How good is the Bayes posterior in deep neural networks really?

F Wenzel, K Roth, BS Veeling, J Świątkowski… - arXiv preprint arXiv …, 2020 - arxiv.org
During the past five years the Bayesian deep learning community has developed
increasingly accurate and efficient approximate inference procedures that allow for …