- Academic Search

Towards explaining the regularization effect of initial large learning rate in training neural networks

Y Li, C Wei, T Ma - Advances in neural information …, 2019 - proceedings.neurips.cc

Stochastic gradient descent with a large initial learning rate is widely used for training
modern neural net architectures. Although a small initial learning rate allows for faster …

Gem Citer Citeret af 372 Relaterede artikler Alle 7 versioner Vis som HTML

[Free GPT-4]

[PDF] neurips.cc

Deep learning through the lens of example difficulty

R Baldock, H Maennel… - Advances in Neural …, 2021 - proceedings.neurips.cc

Existing work on understanding deep learning often employs measures that compress all
data-dependent information into a few numbers. In this work, we adopt a perspective based …

Gem Citer Citeret af 162 Relaterede artikler Alle 8 versioner Vis som HTML

[Free GPT-4]

[PDF] mlr.press

Davinz: Data valuation using deep neural networks at initialization

Z Wu, Y Shu, BKH Low - International Conference on …, 2022 - proceedings.mlr.press

Recent years have witnessed a surge of interest in develo** trustworthy methods to
evaluate the value of data in many real-world applications (eg, collaborative machine …

Gem Citer Citeret af 61 Relaterede artikler Alle 4 versioner Vis som HTML

[Free GPT-4]

[PDF] arxiv.org

How does learning rate decay help modern neural networks?

K You, M Long, J Wang, MI Jordan - arxiv preprint arxiv:1908.01878, 2019 - arxiv.org

Learning rate decay (lrDecay) is a\emph {de facto} technique for training modern neural
networks. It starts with a large learning rate and then decays it multiple times. It is empirically …

Gem Citer Citeret af 302 Relaterede artikler Alle 5 versioner Vis som HTML

[Free GPT-4]

[PDF] mlr.press

Mechanistic mode connectivity

ES Lubana, EJ Bigelow, RP Dick… - International …, 2023 - proceedings.mlr.press

We study neural network loss landscapes through the lens of mode connectivity, the
observation that minimizers of neural networks retrieved via training on a dataset are …

Gem Citer Citeret af 48 Relaterede artikler Alle 9 versioner Vis som HTML

[Free GPT-4]

[PDF] arxiv.org

When do curricula work?

X Wu, E Dyer, B Neyshabur - arxiv preprint arxiv:2012.03107, 2020 - arxiv.org

Inspired by human learning, researchers have proposed ordering examples during training
based on their difficulty. Both curriculum learning, exposing a network to easier examples …

Gem Citer Citeret af 141 Relaterede artikler Alle 6 versioner Vis som HTML

[Free GPT-4]

[PDF] arxiv.org

T-mars: Improving visual representations by circumventing text feature learning

P Maini, S Goyal, ZC Lipton, JZ Kolter… - arxiv preprint arxiv …, 2023 - arxiv.org

Large web-sourced multimodal datasets have powered a slew of new methods for learning
general-purpose visual representations, advancing the state of the art in computer vision …

Gem Citer Citeret af 31 Relaterede artikler Alle 3 versioner Vis som HTML

[Free GPT-4]

[PDF] neurips.cc

Characterizing datapoints via second-split forgetting

P Maini, S Garg, Z Lipton… - Advances in Neural …, 2022 - proceedings.neurips.cc

Researchers investigating example hardness have increasingly focused on the dynamics by
which neural networks learn and forget examples throughout training. Popular metrics …

Gem Citer Citeret af 41 Relaterede artikler Alle 9 versioner Vis som HTML

[Free GPT-4]

[PDF] thecvf.com

Estimating example difficulty using variance of gradients

C Agarwal, D D'souza… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

In machine learning, a question of great interest is understanding what examples are
challenging for a model to classify. Identifying atypical examples ensures the safe …

Gem Citer Citeret af 120 Relaterede artikler Alle 9 versioner Vis som HTML

[Free GPT-4]

[PDF] nature.com

Detecting shortcut learning for fair medical AI using shortcut testing

A Brown, N Tomasev, J Freyberg, Y Liu… - Nature …, 2023 - nature.com

Abstract Machine learning (ML) holds great promise for improving healthcare, but it is critical
to ensure that its use will not propagate or amplify health disparities. An important step is to …

Gem Citer Citeret af 56 Relaterede artikler Alle 13 versioner

Opret underretning

Citer

Avanceret søgning

Gemt i Min samling

Do deep neural networks learn shallow learnable examples first?

Towards explaining the regularization effect of initial large learning rate in training neural networks

Deep learning through the lens of example difficulty

Davinz: Data valuation using deep neural networks at initialization

How does learning rate decay help modern neural networks?

Mechanistic mode connectivity

When do curricula work?

T-mars: Improving visual representations by circumventing text feature learning

Characterizing datapoints via second-split forgetting

Estimating example difficulty using variance of gradients

Detecting shortcut learning for fair medical AI using shortcut testing