AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality
Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), are known
to enhance training efficiency in Large Language Models (LLMs). Due to the limited …
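The snippet doesn't show AlphaLoRA's layer-wise allocation rule, but for context, here is a minimal sketch of a standard LoRA adapter around a frozen linear layer (the class name and the rank/alpha defaults are illustrative, not from the paper):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen pretrained linear layer plus a trainable low-rank update."""
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False                      # keep pretrained weights frozen
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
            self.scale = alpha / rank

        def forward(self, x):
            # y = W x + (alpha / r) * B A x
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)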
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
Existing work in scientific machine learning (SciML) has shown that data-driven learning of
solution operators can provide a fast approximate alternative to classical numerical partial …
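The truncated abstract doesn't specify which UQ method the paper uses; one common choice for flagging out-of-domain inputs to a learned solution operator is ensemble disagreement, sketched here (function and variable names are ours):

    import torch

    def ensemble_predict(models, x):
        """Mean prediction and per-point std across an ensemble of operator surrogates.
        Large std is a (heuristic) signal that x lies outside the training domain."""
        with torch.no_grad():
            preds = torch.stack([m(x) for m in models])  # (n_models, batch, ...)
        return preds.mean(dim=0), preds.std(dim=0)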
Temperature Optimization for Bayesian Deep Learning
The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where
tempering the posterior to a cold temperature often improves the predictive performance of …
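Concretely, tempering replaces the posterior p(theta | D) with p(theta | D)^(1/T); the CPE is the observation that T < 1 often helps. A sketch of one common temperature-scaled SGLD discretization (step-size conventions vary across papers):

    import torch

    def tempered_sgld_step(params, grads_log_post, lr=1e-5, temperature=1.0):
        """One SGLD step targeting p(theta | D)^(1/T).
        temperature < 1 ('cold') injects less noise, concentrating the samples."""
        for p, g in zip(params, grads_log_post):
            noise = torch.randn_like(p) * (2.0 * lr * temperature) ** 0.5
            p.add_(lr * g + noise)  # ascend the log posterior plus tempered Gaussian noise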
Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Modern training strategies of deep neural networks (NNs) tend to induce heavy-tailed (HT)
spectra of layer weights. Extensive efforts to study this phenomenon have found that NNs …
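A standard way to inspect this empirically is to take the eigenvalues of W W^T for a layer and estimate a power-law tail exponent. A minimal sketch using the Hill estimator (the weight matrix here is random, just to make the snippet runnable):

    import numpy as np

    def hill_alpha(values, k):
        """Hill estimate of the density exponent alpha in p(x) ~ x^(-alpha),
        computed from the k largest values."""
        s = np.sort(values)[::-1]
        gamma = np.mean(np.log(s[:k] / s[k]))  # Hill estimator of the tail index
        return 1.0 + 1.0 / gamma

    W = np.random.randn(512, 512)                   # stand-in for a layer weight matrix
    eigs = np.linalg.svd(W, compute_uv=False) ** 2  # eigenvalues of W @ W.T
    print(hill_alpha(eigs, k=50))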
A PAC-Bayesian Perspective on the Interpolating Information Criterion
Deep learning is renowned for its theory-practice gap, whereby principled theory typically
fails to provide much beneficial guidance for implementation in practice. This has been …
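The Interpolating Information Criterion itself isn't stated in the snippet; for orientation, here is the classic McAllester-style PAC-Bayes bound such analyses start from, as a small calculator (the function name is ours):

    import math

    def pac_bayes_bound(emp_risk, kl, n, delta=0.05):
        """McAllester bound: with probability >= 1 - delta,
        E_Q[risk] <= E_Q[emp_risk] + sqrt((KL(Q||P) + ln(2 sqrt(n)/delta)) / (2n))."""
        return emp_risk + math.sqrt((kl + math.log(2 * math.sqrt(n) / delta)) / (2 * n))

    print(pac_bayes_bound(emp_risk=0.05, kl=25.0, n=10_000))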
Gibbs-Based Information Criteria and the Over-Parameterized Regime
Double-descent refers to the unexpected drop in test loss of a learning algorithm beyond an
interpolating threshold with over-parameterization, which is not predicted by information …
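A self-contained illustration of the double-descent curve the abstract refers to, using minimum-norm least squares on a growing feature set (test error typically peaks near the interpolation threshold d = n, then falls again):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d_max, n_test = 100, 300, 2000
    w_true = rng.normal(size=d_max) / np.sqrt(d_max)
    X = rng.normal(size=(n, d_max))
    y = X @ w_true + 0.1 * rng.normal(size=n)
    Xt = rng.normal(size=(n_test, d_max))
    yt = Xt @ w_true

    for d in (20, 50, 90, 100, 110, 150, 300):
        w = np.linalg.pinv(X[:, :d]) @ y           # min-norm fit on the first d features
        mse = np.mean((Xt[:, :d] @ w - yt) ** 2)
        print(f"d={d:3d}  test MSE={mse:.3f}")     # peak expected near d = n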
On Implicit Smoothness Regularization in Deep Learning
M Gamba - 2024 - diva-portal.org
State of the art neural networks provide a rich class of function approximators, fueling the
remarkable success of gradient-based deep learning on complex high-dimensional …
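One way such smoothness is probed in practice is via the norm of the input gradient of the learned function; a minimal sketch (assumes the first tensor dimension is the batch):

    import torch

    def input_gradient_norm(model, x):
        """Mean L2 norm of d model(x) / d x over a batch, a common local-smoothness proxy."""
        x = x.clone().requires_grad_(True)
        out = model(x).sum()
        (grad,) = torch.autograd.grad(out, x)
        return grad.flatten(1).norm(dim=1).mean()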
An Asymptotically Optimal Method for Constrained Stochastic Optimization
Sen Na, Yihang Gao, Michael K. Ng, and …
We perform statistical inference for the solution of stochastic optimization problems with
equality and box inequality constraints. The considered problems are prevalent in statistics …
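Methods in this line of work are typically stochastic SQP-type schemes; as a much simpler point of reference, the box-constraint part alone can be handled by projected stochastic gradient steps (equality constraints require more machinery, e.g. SQP):

    import numpy as np

    def projected_sgd_step(x, grad, lr, lower, upper):
        """One stochastic-gradient step followed by projection onto the box
        lower <= x <= upper (elementwise)."""
        return np.clip(x - lr * grad, lower, upper)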