A survey of uncertainty in deep neural networks

J Gawlikowski, CRN Tassi, M Ali, J Lee, M Humt… - Artificial Intelligence …, 2023 - Springer
Over the last decade, neural networks have reached almost every field of science and
become a crucial part of various real-world applications. Due to the increasing spread …

Global convergence of Langevin dynamics based algorithms for nonconvex optimization

P Xu, J Chen, D Zou, Q Gu - Advances in Neural …, 2018 - proceedings.neurips.cc
We present a unified framework to analyze the global convergence of Langevin-dynamics-based
algorithms for nonconvex finite-sum optimization with $n$ component functions. At …
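The algorithms in this family take a stochastic-gradient step on the finite-sum objective and add Gaussian noise scaled by the step size and an inverse temperature. As a point of reference, here is a minimal stochastic gradient Langevin dynamics (SGLD) sketch on a toy nonconvex finite-sum objective; the double-well components, step size eta, inverse temperature beta, and batch size are illustrative assumptions, and the variance-reduced variants analyzed alongside SGLD are not shown.

import numpy as np

rng = np.random.default_rng(0)

# Toy nonconvex finite-sum objective F(x) = (1/n) sum_i f_i(x) with
# double-well components f_i(x) = sum_j [ (x_j^2 - 1)^2 / 4 + a_ij * x_j ].
n, d = 200, 5
A = 0.1 * rng.standard_normal((n, d))

def stoch_grad(x, idx):
    """Mini-batch estimate of the gradient of F at x using component indices idx."""
    return x * (x ** 2 - 1.0) + A[idx].mean(axis=0)

# SGLD: mini-batch gradient step plus Gaussian noise scaled by the step size
# eta and the inverse temperature beta.
eta, beta, batch = 1e-2, 10.0, 32
x = rng.standard_normal(d)
for _ in range(5000):
    idx = rng.choice(n, size=batch, replace=False)
    x = x - eta * stoch_grad(x, idx) + np.sqrt(2.0 * eta / beta) * rng.standard_normal(d)

print("final iterate:", x)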

Learning-rate annealing methods for deep neural networks

K Nakamura, B Derbel, KJ Won, BW Hong - Electronics, 2021 - mdpi.com
Deep neural networks (DNNs) have achieved great success in recent decades. DNNs are
optimized using stochastic gradient descent (SGD) with learning-rate annealing that …
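For context, learning-rate annealing makes the SGD step size a decreasing function of training progress. Below is a minimal sketch of two common schedules, step decay and cosine annealing, applied to SGD on a toy quadratic; the base rate, decay factors, and epoch counts are illustrative, not the settings studied in the paper.

import math
import numpy as np

def step_decay(lr0, epoch, drop=0.5, every=10):
    """Multiply the base learning rate by `drop` every `every` epochs."""
    return lr0 * (drop ** (epoch // every))

def cosine_annealing(lr0, epoch, total_epochs):
    """Decay from lr0 toward 0 along a half cosine over the training run."""
    return 0.5 * lr0 * (1.0 + math.cos(math.pi * epoch / total_epochs))

# SGD on a toy quadratic f(w) = 0.5 * ||w - 3||^2 with an annealed learning rate.
rng = np.random.default_rng(0)
w = rng.standard_normal(4)
target = 3.0 * np.ones(4)
total_epochs = 50
for epoch in range(total_epochs):
    lr = cosine_annealing(0.5, epoch, total_epochs)       # or step_decay(0.5, epoch)
    grad = (w - target) + 0.01 * rng.standard_normal(4)   # noisy gradient
    w -= lr * grad

print("learned w:", w)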

Faster convergence of stochastic gradient Langevin dynamics for non-log-concave sampling

D Zou, P Xu, Q Gu - Uncertainty in Artificial Intelligence, 2021 - proceedings.mlr.press
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD)
for sampling from a class of distributions that can be non-log-concave. At the core of our …

Accelerating approximate Thompson sampling with underdamped Langevin Monte Carlo

H Zheng, W Deng, C Moya… - … Conference on Artificial …, 2024 - proceedings.mlr.press
Approximate Thompson sampling with Langevin Monte Carlo broadens its reach
from Gaussian posterior sampling to encompass more general smooth posteriors. However …
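To make the setting concrete, here is a minimal sketch of Thompson sampling for a linear-Gaussian bandit in which each round's posterior sample is drawn approximately by a few underdamped (kinetic) Langevin steps instead of exactly; the bandit instance, friction gamma, step size h, and number of inner steps are illustrative assumptions, and the integrator is a simple Euler-type discretization rather than the paper's exact scheme.

import numpy as np

rng = np.random.default_rng(0)

d, n_arms, sigma = 3, 5, 0.5
arms = rng.standard_normal((n_arms, d))     # fixed arm feature vectors
theta_star = rng.standard_normal(d)         # unknown true parameter

X, r = [], []                               # observed features and rewards

def grad_potential(theta):
    """Gradient of the negative log posterior: N(0, I) prior plus Gaussian likelihood."""
    g = theta.copy()
    for x, y in zip(X, r):
        g += (x @ theta - y) / sigma ** 2 * x
    return g

def ulmc_sample(theta, v, n_steps=20, h=0.05, gamma=2.0):
    """A few underdamped Langevin steps (Euler-type discretization)."""
    for _ in range(n_steps):
        v = (v - h * gamma * v - h * grad_potential(theta)
             + np.sqrt(2.0 * gamma * h) * rng.standard_normal(d))
        theta = theta + h * v
    return theta, v

theta, v = np.zeros(d), np.zeros(d)
for t in range(200):
    # Approximate posterior sample via underdamped Langevin Monte Carlo.
    theta, v = ulmc_sample(theta, v)
    # Thompson sampling: play the arm that is best under the sampled parameter.
    a = int(np.argmax(arms @ theta))
    reward = arms[a] @ theta_star + sigma * rng.standard_normal()
    X.append(arms[a])
    r.append(reward)

print("estimated theta:", theta)
print("true theta:     ", theta_star)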

Fractional underdamped Langevin dynamics: Retargeting SGD with momentum under heavy-tailed gradient noise

U Simsekli, L Zhu, YW Teh… - … on machine learning, 2020 - proceedings.mlr.press
Stochastic gradient descent with momentum (SGDm) is one of the most popular optimization
algorithms in deep learning. While there is a rich theory of SGDm for convex problems, the …
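For reference, this is the SGD-with-momentum (heavy-ball) update that the work above reinterprets through underdamped Langevin dynamics; a minimal sketch on a toy least-squares problem with illustrative hyperparameters, and without the heavy-tailed gradient-noise model that the paper actually studies.

import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares objective f(w) = (1/2n) * ||X w - y||^2.
n, d = 500, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

lr, momentum, batch = 0.05, 0.9, 32
w, buf = np.zeros(d), np.zeros(d)
for _ in range(2000):
    idx = rng.choice(n, size=batch, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch    # mini-batch gradient
    buf = momentum * buf + grad                        # heavy-ball momentum buffer
    w -= lr * buf

print("parameter error:", np.linalg.norm(w - w_true))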

Distributed learning systems with first-order methods

J Liu, C Zhang - Foundations and Trends® in Databases, 2020 - nowpublishers.com
Scalable and efficient distributed learning is one of the main driving forces behind the recent
rapid advancement of machine learning and artificial intelligence. One prominent feature of …
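One prominent first-order pattern in such systems is synchronous data-parallel SGD: each worker computes a mini-batch gradient on its shard of the data, and the gradients are averaged (an all-reduce) before a single model update. Below is a minimal single-process simulation of that pattern; the shard layout, batch size, and toy least-squares objective are illustrative assumptions, and no real communication layer is involved.

import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem split across "workers".
n, d, n_workers = 800, 6, 4
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

shards = np.array_split(np.arange(n), n_workers)   # each worker owns one shard
w, lr, batch = np.zeros(d), 0.1, 16

def local_gradient(shard, w):
    """Mini-batch gradient computed on one worker's shard."""
    idx = rng.choice(shard, size=batch, replace=False)
    return X[idx].T @ (X[idx] @ w - y[idx]) / batch

for _ in range(1000):
    grads = [local_gradient(shard, w) for shard in shards]
    w -= lr * np.mean(grads, axis=0)               # averaged-gradient step

print("parameter error:", np.linalg.norm(w - w_true))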

Adaptive weight decay for deep neural networks

K Nakamura, BW Hong - IEEE Access, 2019 - ieeexplore.ieee.org
Regularization in the optimization of deep neural networks is often critical to avoid
undesirable over-fitting and leads to better generalization of the model. One of the most popular …
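As background, weight decay (L2 regularization) enters the SGD update as an extra shrinkage term on the parameters; the paper's contribution is to adapt the decay coefficient during training, which is not reproduced here. A minimal sketch with a fixed coefficient on a toy regression problem, with illustrative hyperparameters.

import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem.
n, d = 300, 8
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

lr, weight_decay, batch = 0.05, 1e-3, 32
w = np.zeros(d)
for _ in range(2000):
    idx = rng.choice(n, size=batch, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
    w -= lr * (grad + weight_decay * w)   # L2 weight decay shrinks w toward zero

print("||w||:", np.linalg.norm(w))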

Primal dual interpretation of the proximal stochastic gradient Langevin algorithm

A Salim, P Richtarik - Advances in Neural Information …, 2020 - proceedings.neurips.cc
We consider the task of sampling with respect to a log-concave probability distribution. The
potential of the target distribution is assumed to be composite, i.e., written as the sum of a …
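For intuition about this composite setting, here is a minimal sketch of a proximal stochastic gradient Langevin step: a noisy gradient step on the smooth part of the potential, Gaussian noise injection, then the proximal operator of the nonsmooth part (here an L1 term, whose prox is soft-thresholding); the target, step size, and exact ordering of operations are illustrative assumptions rather than the paper's precise scheme.

import numpy as np

rng = np.random.default_rng(0)

# Composite potential U(x) = f(x) + g(x): smooth part f, nonsmooth part g = lam * ||x||_1.
d, lam = 5, 0.5
mu = rng.standard_normal(d)

def grad_f(x):
    """Gradient of the smooth part f(x) = 0.5 * ||x - mu||^2."""
    return x - mu

def prox_g(x, step):
    """Proximal operator of step * lam * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)

eta = 1e-2
x = np.zeros(d)
samples = []
for _ in range(20000):
    g_est = grad_f(x) + 0.1 * rng.standard_normal(d)   # stand-in for a stochastic gradient
    x = prox_g(x - eta * g_est + np.sqrt(2.0 * eta) * rng.standard_normal(d), eta)
    samples.append(x.copy())

print("posterior mean estimate:", np.mean(samples[5000:], axis=0))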

On the convergence of Hamiltonian Monte Carlo with stochastic gradients

D Zou, Q Gu - International Conference on Machine …, 2021 - proceedings.mlr.press
Hamiltonian Monte Carlo (HMC), built on Hamilton's equations, has seen
great success in sampling from high-dimensional posterior distributions …
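To fix notation, a minimal sketch of HMC in which the leapfrog integrator uses a mini-batch gradient estimate of the potential and the Metropolis correction is dropped, the usual compromise in stochastic-gradient variants; the toy Gaussian posterior over a mean parameter, step size, path length, and batch size are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy posterior over a Gaussian mean: N(0, I) prior, N(theta, sigma^2 I) likelihood.
n, d, sigma = 1000, 3, 1.0
data = 2.0 + sigma * rng.standard_normal((n, d))

def stoch_grad_U(theta, batch=100):
    """Mini-batch estimate of the gradient of the potential (negative log posterior)."""
    idx = rng.choice(n, size=batch, replace=False)
    return theta + (n / batch) * np.sum(theta - data[idx], axis=0) / sigma ** 2

def leapfrog(theta, p, eps, n_steps):
    """Leapfrog integration of Hamiltonian dynamics using stochastic gradients."""
    p = p - 0.5 * eps * stoch_grad_U(theta)
    for _ in range(n_steps - 1):
        theta = theta + eps * p
        p = p - eps * stoch_grad_U(theta)
    theta = theta + eps * p
    p = p - 0.5 * eps * stoch_grad_U(theta)
    return theta, p

theta, samples = np.zeros(d), []
for _ in range(500):
    p = rng.standard_normal(d)                 # resample the momentum each iteration
    theta, _ = leapfrog(theta, p, eps=1e-3, n_steps=10)
    samples.append(theta.copy())

print("posterior mean estimate:", np.mean(samples[100:], axis=0))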