Do Bayesian neural networks need to be fully stochastic?
We investigate the benefit of treating all the parameters in a Bayesian neural network
stochastically and find compelling theoretical and empirical evidence that this standard …
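A minimal sketch (in PyTorch, not the paper's code) of the partial-stochasticity question the abstract raises: a deterministic backbone with a mean-field Gaussian last layer sampled via the reparameterization trick. The class names, layer sizes, and initial log-sigma are illustrative assumptions.

import torch
import torch.nn as nn

class BayesianLinear(nn.Module):
    """Mean-field Gaussian linear layer (the only stochastic part)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_log_sigma = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Reparameterization: w = mu + sigma * eps, with eps ~ N(0, I)
        eps = torch.randn_like(self.w_mu)
        w = self.w_mu + self.w_log_sigma.exp() * eps
        return x @ w.t() + self.b

class PartiallyStochasticNet(nn.Module):
    def __init__(self, d_in=10, d_hidden=64, d_out=1):
        super().__init__()
        # Deterministic backbone: ordinary point-estimate weights.
        self.backbone = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        # Stochastic head: only these weights carry a posterior approximation.
        self.head = BayesianLinear(d_hidden, d_out)

    def forward(self, x):
        return self.head(self.backbone(x))

net = PartiallyStochasticNet()
x = torch.randn(8, 10)
# Each forward pass draws a fresh weight sample for the head only.
samples = torch.stack([net(x) for _ in range(5)])
print(samples.shape)  # torch.Size([5, 8, 1])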
Separation of scales and a thermodynamic description of feature learning in some CNNs
Deep neural networks (DNNs) are powerful tools for compressing and distilling information.
Their scale and complexity, often involving billions of inter-dependent parameters, render …
SAM as an optimal relaxation of Bayes
Sharpness-aware minimization (SAM) and related adversarial deep-learning methods can
drastically improve generalization, but their underlying mechanisms are not yet fully …
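To make the SAM mechanism referenced above concrete, here is a minimal sketch of the standard two-step SAM update (perturb the weights toward higher loss within an L2 ball, then descend using the gradient at the perturbed point). It is not the Bayesian relaxation derived in the paper; the function name, rho, learning rate, and toy objective are illustrative assumptions.

import torch

def sam_step(params, loss_fn, lr=0.1, rho=0.05):
    """One SAM step on a list of leaf parameter tensors."""
    # 1) Gradient at the current weights.
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params)
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12

    # 2) Ascend to the approximate worst case within an L2 ball of radius rho.
    perturbed = [p + rho * g / grad_norm for p, g in zip(params, grads)]

    # 3) The gradient taken at the perturbed weights drives the actual update.
    sharp_loss = loss_fn(perturbed)
    sharp_grads = torch.autograd.grad(sharp_loss, perturbed)
    with torch.no_grad():
        for p, g in zip(params, sharp_grads):
            p -= lr * g
    return loss.item()

# Toy usage: minimize a simple quadratic in two parameter tensors.
w = [torch.randn(3, requires_grad=True), torch.randn(1, requires_grad=True)]
quadratic = lambda ps: sum((p ** 2).sum() for p in ps)
for _ in range(10):
    sam_step(w, quadratic)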
On the detrimental effect of invariances in the likelihood for variational inference
Variational Bayesian posterior inference often requires simplifying approximations such as
mean-field parametrisation to ensure tractability. However, prior work has associated the …
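For reference, the mean-field approximation mentioned above in its textbook form (not the paper's specific construction): the posterior over parameters is replaced by a fully factorised q, fitted by maximising the evidence lower bound.

% Standard mean-field variational objective (assumed textbook form).
\begin{align}
  q_\phi(\theta) &= \prod_i q_{\phi_i}(\theta_i), \\
  \mathcal{L}(\phi) &= \mathbb{E}_{q_\phi(\theta)}\!\left[\log p(\mathcal{D} \mid \theta)\right]
    - \mathrm{KL}\!\left(q_\phi(\theta)\,\|\,p(\theta)\right)
    \;\le\; \log p(\mathcal{D}).
\end{align}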
Variational learning is effective for large deep networks
We give extensive empirical evidence against the common belief that variational learning is
ineffective for large neural networks. We show that an optimizer called Improved Variational …
Markov chain score ascent: A unifying framework of variational inference with Markovian gradients
Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient
descent (SGD) is challenging since its gradient is defined as an integral over the posterior …
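The difficulty the abstract refers to can be seen from the standard identity for the inclusive KL gradient (not specific to this paper): the score of the variational family q_lambda must be averaged over the intractable posterior itself, which plain SGD cannot sample from directly.

% Gradient of the inclusive KL with respect to the variational parameters.
\nabla_\lambda \,\mathrm{KL}\!\left(p(\theta \mid \mathcal{D}) \,\|\, q_\lambda(\theta)\right)
  = -\,\mathbb{E}_{p(\theta \mid \mathcal{D})}\!\left[\nabla_\lambda \log q_\lambda(\theta)\right].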
Sparse MoEs meet efficient ensembles
Machine learning models based on the aggregated outputs of submodels, either at the
activation or prediction levels, often exhibit strong performance compared to individual …
Streamlining Prediction in Bayesian Deep Learning
The rising interest in Bayesian deep learning (BDL) has led to a plethora of methods for
estimating the posterior distribution. However, efficient computation of inferences, such as …
Law of large numbers for Bayesian two-layer neural network trained with variational inference
We provide a rigorous analysis of training by variational inference (VI) of Bayesian neural
networks in the two-layer and infinite-width case. We consider a regression problem with a …
On the disconnect between theory and practice of overparametrized neural networks
The infinite-width limit of neural networks (NNs) has garnered significant attention as a
theoretical framework for analyzing the behavior of large-scale, overparametrized networks …