Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …

A survey of uncertainty in deep neural networks

J Gawlikowski, CRN Tassi, M Ali, J Lee, M Humt… - Artificial Intelligence …, 2023 - Springer
Over the last decade, neural networks have reached almost every field of science and
become a crucial part of various real-world applications. Due to the increasing spread …

Laplace redux - effortless Bayesian deep learning

E Daxberger, A Kristiadi, A Immer… - Advances in …, 2021 - proceedings.neurips.cc
Bayesian formulations of deep learning have been shown to have compelling theoretical
properties and offer practical functional benefits, such as improved predictive uncertainty …
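
As a point of reference (a sketch, not taken from the snippet), the Laplace approximation underlying this line of work fits a Gaussian to the posterior around an already-trained (MAP) estimate \hat{\theta}:

p(\theta \mid \mathcal{D}) \approx \mathcal{N}\!\left(\hat{\theta},\, \Sigma\right), \qquad \Sigma = \left( \nabla_\theta^2 \left[ -\log p(\mathcal{D} \mid \theta) - \log p(\theta) \right] \Big|_{\theta = \hat{\theta}} \right)^{-1},

so predictive uncertainty is obtained from curvature at the trained network rather than from retraining, which is what the "effortless" in the title refers to.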

Sophia: A scalable stochastic second-order optimizer for language model pre-training

H Liu, Z Li, D Hall, P Liang, T Ma - arXiv preprint arXiv:2305.14342, 2023 - arxiv.org
Given the massive cost of language model pre-training, a non-trivial improvement of the
optimization algorithm would lead to a material reduction in the time and cost of training …

Studying large language model generalization with influence functions

R Grosse, J Bae, C Anil, N Elhage, A Tamkin… - arXiv preprint arXiv …, 2023 - arxiv.org
When trying to gain better visibility into a machine learning model in order to understand and
mitigate the associated risks, a potentially valuable source of evidence is: which training …
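
For orientation (the standard influence-function form, with symbols assumed rather than quoted from the paper), the influence of a training point z on the loss at a query z_q is typically estimated as

\mathcal{I}(z, z_q) = -\,\nabla_\theta L(z_q, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1}\, \nabla_\theta L(z, \hat{\theta}), \qquad H_{\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat{\theta}),

where \hat{\theta} are the trained parameters; at LLM scale the Hessian-inverse-vector product is the expensive part and is usually approximated (e.g., with Kronecker-factored curvature).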

Make sharpness-aware minimization stronger: A sparsified perturbation approach

P Mi, L Shen, T Ren, Y Zhou, X Sun… - Advances in Neural …, 2022 - proceedings.neurips.cc
Deep neural networks often suffer from poor generalization caused by complex and non-
convex loss landscapes. One of the popular solutions is Sharpness-Aware Minimization …
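
For context, the standard SAM objective and its usual one-step approximation of the worst-case perturbation can be written (notation assumed) as

\min_{\theta} \max_{\|\epsilon\|_2 \le \rho} L(\theta + \epsilon), \qquad \hat{\epsilon}(\theta) = \rho\, \frac{\nabla_\theta L(\theta)}{\|\nabla_\theta L(\theta)\|_2},

with the parameter update taken along \nabla_\theta L(\theta + \hat{\epsilon}); the sparsified-perturbation approach in the title presumably restricts the perturbation \epsilon to a sparse subset of coordinates.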

Limitations of the empirical Fisher approximation for natural gradient descent

F Kunstner, P Hennig, L Balles - Advances in neural …, 2019 - proceedings.neurips.cc
Natural gradient descent, which preconditions a gradient descent update with the Fisher
information matrix of the underlying statistical model, is a way to capture partial second …
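
As a sketch of the update described here (symbols assumed), natural gradient descent preconditions the step with the Fisher information matrix F:

\theta_{t+1} = \theta_t - \eta\, F(\theta_t)^{-1} \nabla_\theta L(\theta_t), \qquad F(\theta) = \mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\!\left[ \nabla_\theta \log p_\theta(y \mid x)\, \nabla_\theta \log p_\theta(y \mid x)^{\top} \right],

whereas the "empirical Fisher" of the title replaces the model-sampled labels y \sim p_\theta(\cdot \mid x) with the observed training labels, which is the approximation whose limitations the paper examines.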

Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast amount of open instruction datasets, naively training an LLM on …

DataInf: Efficiently estimating data influence in LoRA-tuned LLMs and diffusion models

Y Kwon, E Wu, K Wu, J Zou - arXiv preprint arXiv:2310.00902, 2023 - arxiv.org
Quantifying the impact of training data points is crucial for understanding the outputs of
machine learning models and for improving the transparency of the AI pipeline. The …

Which algorithmic choices matter at which batch sizes? Insights from a noisy quadratic model

G Zhang, L Li, Z Nado, J Martens… - Advances in neural …, 2019 - proceedings.neurips.cc
Increasing the batch size is a popular way to speed up neural network training, but beyond
some critical batch size, larger batch sizes yield diminishing returns. In this work, we study …
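
A minimal sketch of the kind of model the title refers to (all symbols assumed): a convex quadratic optimized with gradient estimates whose noise covariance scales inversely with the batch size B,

L(\theta) = \tfrac{1}{2}\, \theta^{\top} H \theta, \qquad g_B(\theta) = H\theta + \epsilon, \quad \epsilon \sim \mathcal{N}\!\left(0,\, \tfrac{1}{B} C\right),

so doubling B halves the gradient-noise variance but does nothing about the curvature-limited part of the step, which is one way to see why larger batches eventually give diminishing returns.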