Nonconvex optimization meets low-rank matrix factorization: An overview

Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org
Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …
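
A minimal sketch of the kind of nonconvex formulation this survey covers: gradient descent on the factored (Burer-Monteiro) objective f(X) = 0.25 * ||X X^T - M||_F^2 for a symmetric rank-r target M. The sizes, initialization scale, and step size below are illustrative assumptions, not the survey's own code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 3
X_true = rng.standard_normal((n, r))
M = X_true @ X_true.T                    # symmetric rank-r ground truth

X = 0.1 * rng.standard_normal((n, r))    # small random initialization (assumed)
eta = 1e-3                               # step size: an untuned assumption
for _ in range(2000):
    grad = (X @ X.T - M) @ X             # gradient of 0.25*||X X^T - M||_F^2
    X -= eta * grad

print("relative error:", np.linalg.norm(X @ X.T - M) / np.linalg.norm(M))
```

Despite the nonconvexity in X (the factorization is only identifiable up to rotation), plain gradient descent from small random initialization recovers M in this toy run, which is the kind of behavior the survey's guarantees formalize.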

On the implicit bias in deep-learning algorithms

G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly successful in recent years and has led to dramatic improvements in multiple domains …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Fine-tuning can distort pretrained features and underperform out-of-distribution

A Kumar, A Raghunathan, R Jones, T Ma… - arXiv preprint arXiv …, 2022 - arxiv.org
When transferring a pretrained model to a downstream task, two popular methods are full
fine-tuning (updating all the model parameters) and linear probing (updating only the last …
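
The two transfer strategies the snippet names are easy to state in code. A hedged PyTorch sketch: the tiny backbone, head, and learning rates are stand-ins invented here, not the authors' setup.

```python
import torch
import torch.nn as nn

# Stand-in "pretrained" backbone and a fresh task-specific head.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
head = nn.Linear(64, 10)
model = nn.Sequential(backbone, head)

# Linear probing: freeze every backbone parameter, train only the head.
for p in backbone.parameters():
    p.requires_grad = False
probe_opt = torch.optim.SGD(head.parameters(), lr=1e-2)

# Full fine-tuning: unfreeze everything and optimize all parameters
# (typically with a smaller learning rate than probing).
for p in backbone.parameters():
    p.requires_grad = True
finetune_opt = torch.optim.SGD(model.parameters(), lr=1e-4)
```

The paper's question is which of these two optimizers' training loops preserves the pretrained features well enough to generalize out of distribution.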

Reconciling modern machine-learning practice and the classical bias–variance trade-off

M Belkin, D Hsu, S Ma… - Proceedings of the …, 2019 - National Academy of Sciences
Breakthroughs in machine learning are rapidly changing science and society, yet our
fundamental understanding of this technology has lagged far behind. Indeed, one of the …

Deep learning: a statistical viewpoint

PL Bartlett, A Montanari, A Rakhlin - Acta numerica, 2021 - cambridge.org
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …

The implicit bias of gradient descent on separable data

D Soudry, E Hoffer, MS Nacson, S Gunasekar… - Journal of Machine …, 2018 - jmlr.org
We examine gradient descent on unregularized logistic regression problems, with
homogeneous linear predictors on linearly separable datasets. We show the predictor …
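
The paper's setting admits a compact numerical illustration: full-batch gradient descent on the unregularized logistic loss over linearly separable data, where the direction w_t/||w_t|| stabilizes (toward the max-margin direction, per the paper) while ||w_t|| keeps growing. The data, step size, and iteration counts below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(2.0, 0.5, (20, 2)),
               rng.normal(-2.0, 0.5, (20, 2))])  # two linearly separable clusters
y = np.array([1.0] * 20 + [-1.0] * 20)

w = np.zeros(2)
eta = 0.1
for t in range(1, 100001):
    m = np.clip(y * (X @ w), -30, 30)            # margins, clipped for stability
    # gradient of mean(log(1 + exp(-y * x.w)))
    grad = -(X * (y / (1 + np.exp(m)))[:, None]).mean(axis=0)
    w -= eta * grad
    if t in (100, 1000, 10000, 100000):
        print(t, round(np.linalg.norm(w), 2), w / np.linalg.norm(w))
```

The printout shows the norm diverging slowly (logarithmically in t, per the paper) while the normalized direction barely changes after the first few thousand steps.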

On the global convergence of gradient descent for over-parameterized models using optimal transport

L Chizat, F Bach - Advances in neural information …, 2018 - proceedings.neurips.cc
Many tasks in machine learning and signal processing can be solved by minimizing a
convex function of a measure. This includes sparse spikes deconvolution or training a …
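
The truncated sentence refers to training a single-hidden-layer network, which this line of work treats as optimizing over a measure on neurons. A hedged sketch of that particle view, with a toy target and invented scales: gradient descent on m neurons is the m-particle discretization of the corresponding Wasserstein gradient flow on the measure.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 500                                  # neurons, i.e. particles
xs = np.linspace(-1.0, 1.0, 64)
target = np.sin(3 * xs)                  # toy regression target (assumed)

a = rng.standard_normal(m) / m           # output weights with mean-field 1/m scaling
b = rng.standard_normal(m)               # input weights
eta = 0.5
for _ in range(3000):
    feats = np.tanh(np.outer(b, xs))     # m x 64 hidden activations
    resid = a @ feats - target           # network output minus target
    a -= eta * (feats @ resid) / xs.size
    b -= eta * ((a[:, None] * (1 - feats**2) * xs[None, :]) @ resid) / xs.size

final = a @ np.tanh(np.outer(b, xs))
print("final mse:", np.mean((final - target) ** 2))
```

Each (a_i, b_i) pair moves under the gradient of the shared loss, which is exactly a system of interacting particles; the paper's result concerns what this flow converges to as m grows.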

Learning overparameterized neural networks via stochastic gradient descent on structured data

Y Li, Y Liang - Advances in neural information processing …, 2018 - proceedings.neurips.cc
Neural networks have many successful applications, yet far less theoretical
understanding has been gained. Towards bridging this gap, we study the problem of …

Gradient starvation: A learning proclivity in neural networks

M Pezeshki, O Kaba, Y Bengio… - Advances in …, 2021 - proceedings.neurips.cc
We identify and formalize a fundamental gradient descent phenomenon resulting in a
learning proclivity in over-parameterized neural networks. Gradient Starvation arises when …
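
The truncated sentence describes features competing for gradient. A toy illustration, invented here but consistent with the abstract's claim: when one feature separates the data with a large margin and another is weaker but still predictive, cross-entropy gradient descent concentrates weight on the strong feature and learns the weak one far more slowly.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
y = np.where(rng.random(n) < 0.5, 1.0, -1.0)
strong = 3.0 * y + rng.normal(0, 0.1, n)   # large-margin feature
weak = 0.5 * y + rng.normal(0, 0.1, n)     # weaker but still predictive feature
X = np.column_stack([strong, weak])

w = np.zeros(2)
eta = 0.1
for _ in range(5000):
    m = np.clip(y * (X @ w), -30, 30)
    # gradient of the mean logistic loss
    grad = -(X * (y / (1 + np.exp(m)))[:, None]).mean(axis=0)
    w -= eta * grad

print("weights (strong, weak):", w)        # the strong feature dominates
```

Once the strong feature fits the data, the sigmoid factor in the gradient shrinks for every example, so little gradient remains to push weight onto the weak feature: the dynamics that the paper names gradient starvation.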