An overview and comparative analysis of recurrent neural networks for short term load forecasting
The key component in forecasting demand and consumption of resources in a supply
network is an accurate prediction of real-valued time series. Indeed, both service …
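The recurrent architectures such surveys compare all build on the same basic update. As a concrete reference point, here is a minimal sketch of a vanilla (Elman) RNN producing a one-step load forecast; the dimensions, weights, and toy series are illustrative assumptions, not taken from the paper.

    import numpy as np

    # All sizes below are hypothetical, chosen only for illustration.
    n_in, n_hidden = 1, 16                           # one load reading per time step
    rng = np.random.default_rng(0)
    W_xh = rng.normal(0, 0.1, (n_hidden, n_in))      # input-to-hidden weights
    W_hh = rng.normal(0, 0.1, (n_hidden, n_hidden))  # hidden-to-hidden (recurrent) weights
    W_hy = rng.normal(0, 0.1, (1, n_hidden))         # hidden-to-output weights
    b_h, b_y = np.zeros(n_hidden), np.zeros(1)

    def rnn_forecast(series):
        # Fold the history into the hidden state, then read out the next value.
        h = np.zeros(n_hidden)
        for x_t in series:
            h = np.tanh(W_xh @ np.atleast_1d(x_t) + W_hh @ h + b_h)
        return (W_hy @ h + b_y)[0]

    print(rnn_forecast(np.sin(np.linspace(0.0, 3.0, 24))))  # toy "hourly load" series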
Gradient descent finds global minima of deep neural networks
Gradient descent finds a global minimum in training deep neural networks despite the
objective function being non-convex. The current paper proves gradient descent achieves …
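Schematically, guarantees in this line of work take the following form (notation is mine, not the paper's): for a sufficiently over-parameterized network, a small enough step size \eta, and a Gram/kernel matrix at initialization with least eigenvalue \lambda_0 > 0, the training loss contracts geometrically,

\[
  L(\theta_k) \;\le\; \Bigl(1 - \tfrac{\eta \lambda_0}{2}\Bigr)^{k} L(\theta_0),
\]

so gradient descent reaches zero training loss even though L is non-convex in \theta.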
Wide neural networks of any depth evolve as linear models under gradient descent
A longstanding goal in deep learning research has been to precisely characterize training
and generalization. However, the often complex loss landscapes of neural networks have …
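The sense in which wide networks behave linearly can be stated compactly (notation mine): around initialization \theta_0, training stays in the regime where the first-order Taylor expansion in the parameters is accurate,

\[
  f(x;\theta_t) \;\approx\; f(x;\theta_0) \;+\; \nabla_{\theta} f(x;\theta_0)^{\top}\,(\theta_t - \theta_0),
\]

so gradient descent on the network matches gradient descent on a linear model with fixed features \nabla_{\theta} f(x;\theta_0).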
Deep neural networks as gaussian processes
It has long been known that a single-layer fully-connected neural network with an iid prior
over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network …
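The classical correspondence the abstract alludes to can be written out (notation mine). For a single hidden layer f(x) = \sum_{j=1}^{N} v_j\,\phi(w_j^{\top} x)/\sqrt{N} with i.i.d. weights, the central limit theorem gives, as N \to \infty,

\[
  f \;\sim\; \mathcal{GP}(0, K),
  \qquad
  K(x, x') \;=\; \sigma_v^{2}\, \mathbb{E}_{w}\bigl[\phi(w^{\top} x)\,\phi(w^{\top} x')\bigr],
\]

and the deep case iterates this kernel construction layer by layer.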
The principles of deep learning theory
This textbook establishes a theoretical framework for understanding deep learning models
of practical relevance. With an approach that borrows from theoretical physics, Roberts and …
Generative learning for nonlinear dynamics
W Gilpin - Nature Reviews Physics, 2024 - nature.com
Modern generative machine learning models are able to create realistic outputs far beyond
their training data, such as photorealistic artwork, accurate protein structures or …
Understanding batch normalization
Batch normalization (BN) is a technique to normalize activations in intermediate layers of
deep neural networks. Its tendency to improve accuracy and speed up training has …
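For reference, the training-mode transform that BN applies is short enough to state in full; this sketch omits the running statistics used at inference time, and all names are illustrative.

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        # Normalize each feature over the batch, then rescale and shift
        # with the learnable parameters gamma and beta.
        mu = x.mean(axis=0)                    # per-feature batch mean
        var = x.var(axis=0)                    # per-feature batch variance
        x_hat = (x - mu) / np.sqrt(var + eps)  # standardized activations
        return gamma * x_hat + beta

    x = np.random.default_rng(0).normal(5.0, 3.0, size=(32, 4))  # skewed toy batch
    y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
    print(y.mean(axis=0).round(6), y.std(axis=0).round(3))       # ~0 mean, ~1 std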
Understanding plasticity in neural networks
Plasticity, the ability of a neural network to quickly change its predictions in response to new
information, is essential for the adaptability and robustness of deep reinforcement learning …
The shaped transformer: Attention models in the infinite depth-and-width limit
In deep learning theory, the covariance matrix of the representations serves as a proxy to
examine the network's trainability. Motivated by the success of Transformers, we study the …
How good is the Bayes posterior in deep neural networks really?
During the past five years the Bayesian deep learning community has developed
increasingly accurate and efficient approximate inference procedures that allow for …
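For orientation (standard notation, not quoted from the paper): the object whose quality is under scrutiny is the weight posterior and its tempered generalization,

\[
  p(\theta \mid \mathcal{D}) \;\propto\; p(\mathcal{D} \mid \theta)\, p(\theta),
  \qquad
  p_T(\theta \mid \mathcal{D}) \;\propto\; \exp\bigl(-U(\theta)/T\bigr),
  \quad
  U(\theta) = -\log p(\mathcal{D} \mid \theta) - \log p(\theta),
\]

where T = 1 recovers the exact Bayes posterior and T \ne 1 gives the tempered posteriors studied in this literature.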