An overview on restricted Boltzmann machines
Abstract The Restricted Boltzmann Machine (RBM) has aroused wide interest in machine
learning fields during the past decade. This review aims to report the recent developments in …
Preconditioned stochastic gradient Langevin dynamics for deep neural networks
Effective training of deep neural networks suffers from two main issues. The first is that the
parameter space of these models exhibit pathological curvature. Recent methods address …
CNN and RNN based payload classification methods for attack detection
In recent years, machine learning has been widely applied to problems in detecting network
attacks, particularly novel attacks. However, traditional machine learning methods depend …
Preconditioned stochastic gradient descent
XL Li - IEEE transactions on neural networks and learning …, 2017 - ieeexplore.ieee.org
Stochastic gradient descent (SGD) still is the workhorse for many practical problems.
However, it converges slow, and can be difficult to tune. It is possible to precondition SGD to …
Bridging the gap between stochastic gradient MCMC and stochastic optimization
Abstract Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian
analogs to popular stochastic optimization methods; however, this connection is not well …
Tool wear state recognition based on gradient boosting decision tree and hybrid classification RBM
G Li, Y Wang, J He, Q Hao, H Yang, J Wei - The International Journal of …, 2020 - Springer
Machined surface quality and dimensional accuracy are significantly affected by tool wear in
machining process. Tool wear state (TWS) recognition is highly desirable to realize …
Learning weight uncertainty with stochastic gradient MCMC for shape classification
Learning the representation of shape cues in 2D & 3D objects for recognition is a
fundamental task in computer vision. Deep neural networks (DNNs) have shown promising …
Old optimizer, new norm: An anthology
J Bernstein, L Newhouse - arXiv preprint arXiv:2409.20325, 2024 - arxiv.org
Deep learning optimizers are often motivated through a mix of convex and approximate
second-order theory. We select three such methods--Adam, Shampoo and Prodigy--and …
Controlling the Inductive Bias of Wide Neural Networks by Modifying the Kernel's Spectrum
Wide neural networks are biased towards learning certain functions, influencing both the
rate of convergence of gradient descent (GD) and the functions that are reachable with GD …
Modular Duality in Deep Learning
J Bernstein, L Newhouse - arXiv preprint arXiv:2410.21265, 2024 - arxiv.org
An old idea in optimization theory says that since the gradient is a dual vector it may not be
subtracted from the weights without first being mapped to the primal space where the …