AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality
Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), are known
to enhance training efficiency in Large Language Models (LLMs). Due to the limited …
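The snippet doesn't show AlphaLoRA's layer-wise allocation rule, but for context, here is a minimal sketch of a standard LoRA adapter around a frozen linear layer (the class name and the rank/alpha defaults are illustrative, not from the paper):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen pretrained linear layer plus a trainable low-rank update."""
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False                      # keep pretrained weights frozen
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
            self.scale = alpha / rank

        def forward(self, x):
            # y = W x + (alpha / r) * B A x
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)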
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
Existing work in scientific machine learning (SciML) has shown that data-driven learning of
solution operators can provide a fast approximate alternative to classical numerical partial …
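The truncated abstract doesn't specify which UQ method the paper uses; one common choice for flagging out-of-domain inputs to a learned solution operator is ensemble disagreement, sketched here (function and variable names are ours):

    import torch

    def ensemble_predict(models, x):
        """Mean prediction and per-point std across an ensemble of operator surrogates.
        Large std is a (heuristic) signal that x lies outside the training domain."""
        with torch.no_grad():
            preds = torch.stack([m(x) for m in models])  # (n_models, batch, ...)
        return preds.mean(dim=0), preds.std(dim=0)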
Temperature Optimization for Bayesian Deep Learning
The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where
tempering the posterior to a cold temperature often improves the predictive performance of …
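Concretely, tempering replaces the posterior p(theta | D) with p(theta | D)^(1/T); the CPE is the observation that T < 1 often helps. A sketch of one common temperature-scaled SGLD discretization (step-size conventions vary across papers):

    import torch

    def tempered_sgld_step(params, grads_log_post, lr=1e-5, temperature=1.0):
        """One SGLD step targeting p(theta | D)^(1/T).
        temperature < 1 ('cold') injects less noise, concentrating the samples."""
        for p, g in zip(params, grads_log_post):
            noise = torch.randn_like(p) * (2.0 * lr * temperature) ** 0.5
            p.add_(lr * g + noise)  # ascend the log posterior plus tempered Gaussian noise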
Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Modern training strategies of deep neural networks (NNs) tend to induce heavy-tailed (HT)
spectra of layer weights. Extensive efforts to study this phenomenon have found that NNs …
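A standard way to inspect this empirically is to take the eigenvalues of W W^T for a layer and estimate a power-law tail exponent. A minimal sketch using the Hill estimator (the weight matrix here is random, just to make the snippet runnable):

    import numpy as np

    def hill_alpha(values, k):
        """Hill estimate of the density exponent alpha in p(x) ~ x^(-alpha),
        computed from the k largest values."""
        s = np.sort(values)[::-1]
        gamma = np.mean(np.log(s[:k] / s[k]))  # Hill estimator of the tail index
        return 1.0 + 1.0 / gamma

    W = np.random.randn(512, 512)                   # stand-in for a layer weight matrix
    eigs = np.linalg.svd(W, compute_uv=False) ** 2  # eigenvalues of W @ W.T
    print(hill_alpha(eigs, k=50))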
A PAC-Bayesian Perspective on the Interpolating Information Criterion
Deep learning is renowned for its theory-practice gap, whereby principled theory typically
fails to provide much beneficial guidance for implementation in practice. This has been …
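The Interpolating Information Criterion itself isn't stated in the snippet; for orientation, here is the classic McAllester-style PAC-Bayes bound such analyses start from, as a small calculator (the function name is ours):

    import math

    def pac_bayes_bound(emp_risk, kl, n, delta=0.05):
        """McAllester bound: with probability >= 1 - delta,
        E_Q[risk] <= E_Q[emp_risk] + sqrt((KL(Q||P) + ln(2 sqrt(n)/delta)) / (2n))."""
        return emp_risk + math.sqrt((kl + math.log(2 * math.sqrt(n) / delta)) / (2 * n))

    print(pac_bayes_bound(emp_risk=0.05, kl=25.0, n=10_000))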
Gibbs-Based Information Criteria and the Over-Parameterized Regime
Double-descent refers to the unexpected drop in test loss of a learning algorithm beyond an
interpolating threshold with over-parameterization, which is not predicted by information …
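A self-contained illustration of the double-descent curve the abstract refers to, using minimum-norm least squares on a growing feature set (test error typically peaks near the interpolation threshold d = n, then falls again):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d_max, n_test = 100, 300, 2000
    w_true = rng.normal(size=d_max) / np.sqrt(d_max)
    X = rng.normal(size=(n, d_max))
    y = X @ w_true + 0.1 * rng.normal(size=n)
    Xt = rng.normal(size=(n_test, d_max))
    yt = Xt @ w_true

    for d in (20, 50, 90, 100, 110, 150, 300):
        w = np.linalg.pinv(X[:, :d]) @ y           # min-norm fit on the first d features
        mse = np.mean((Xt[:, :d] @ w - yt) ** 2)
        print(f"d={d:3d}  test MSE={mse:.3f}")     # peak expected near d = n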
On Implicit Smoothness Regularization in Deep Learning
M Gamba - 2024 - diva-portal.org
State of the art neural networks provide a rich class of function approximators, fueling the
remarkable success of gradient-based deep learning on complex high-dimensional …
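One way such smoothness is probed in practice is via the norm of the input gradient of the learned function; a minimal sketch (assumes the first tensor dimension is the batch):

    import torch

    def input_gradient_norm(model, x):
        """Mean L2 norm of d model(x) / d x over a batch, a common local-smoothness proxy."""
        x = x.clone().requires_grad_(True)
        out = model(x).sum()
        (grad,) = torch.autograd.grad(out, x)
        return grad.flatten(1).norm(dim=1).mean()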
An Asymptotically Optimal Method for Constrained Stochastic Optimization
Sen Na, Yihang Gao, Michael K. Ng, and …
We perform statistical inference for the solution of stochastic optimization problems with
equality and box inequality constraints. The considered problems are prevalent in statistics …
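Methods in this line of work are typically stochastic SQP-type schemes; as a much simpler point of reference, the box-constraint part alone can be handled by projected stochastic gradient steps (equality constraints require more machinery, e.g. SQP):

    import numpy as np

    def projected_sgd_step(x, grad, lr, lower, upper):
        """One stochastic-gradient step followed by projection onto the box
        lower <= x <= upper (elementwise)."""
        return np.clip(x - lr * grad, lower, upper)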