High-dimensional asymptotics of feature learning: How one gradient step improves the representation

J Ba, MA Erdogdu, T Suzuki, Z Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a
two-layer neural network: $f(\boldsymbol{x})=\frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma$ …
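As a concrete illustration of the setting in this snippet, the sketch below takes a single full-batch gradient step on the first-layer weights of a two-layer network of the form $f(\boldsymbol{x})=\frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma(\boldsymbol{W}\boldsymbol{x})$ (the argument of $\sigma$ is assumed, since the snippet is truncated). The squared loss, ReLU activation, Gaussian data, and step size are illustrative assumptions, not the paper's exact setup.

# Minimal sketch (assumed setup): one gradient step on the first-layer weights W of
# a two-layer network f(x) = (1/sqrt(N)) a^T sigma(W x), squared loss, ReLU.
import numpy as np

rng = np.random.default_rng(0)
n, d, N = 512, 64, 256                         # samples, input dim, hidden width (assumed)
X = rng.standard_normal((n, d))                # inputs
y = rng.standard_normal(n)                     # targets (placeholder)

W = rng.standard_normal((N, d)) / np.sqrt(d)   # first-layer weights
a = rng.standard_normal(N)                     # second-layer weights, kept fixed here

sigma = lambda z: np.maximum(z, 0.0)           # ReLU activation
dsigma = lambda z: (z > 0).astype(z.dtype)     # its derivative

# Forward pass: f(x) = (1/sqrt(N)) a^T sigma(W x)
pre = X @ W.T                                  # (n, N) pre-activations
pred = sigma(pre) @ a / np.sqrt(N)             # (n,) predictions

# Gradient of the empirical squared loss with respect to W only
resid = pred - y
grad_W = ((resid[:, None] * dsigma(pre)) * a[None, :] / np.sqrt(N)).T @ X / n

eta = 1.0                                      # step size (assumed)
W = W - eta * grad_W                           # the single gradient step on W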

Deep learning: a statistical viewpoint

PL Bartlett, A Montanari, A Rakhlin - Acta numerica, 2021 - cambridge.org
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …

Implicit bias of gradient descent for wide two-layer neural networks trained with the logistic loss

L Chizat, F Bach - Conference on learning theory, 2020 - proceedings.mlr.press
Neural networks trained to minimize the logistic (aka cross-entropy) loss with gradient-based
methods are observed to perform well in many supervised classification tasks. Towards …

On the global convergence of gradient descent for over-parameterized models using optimal transport

L Chizat, F Bach - Advances in neural information …, 2018 - proceedings.neurips.cc
Many tasks in machine learning and signal processing can be solved by minimizing a
convex function of a measure. This includes sparse spikes deconvolution or training a …

Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit

S Mei, T Misiakiewicz… - Conference on learning …, 2019 - proceedings.mlr.press
We consider learning two-layer neural networks using stochastic gradient descent. The
mean-field description of this learning dynamics approximates the evolution of the network …

Mean-field Langevin dynamics: Exponential convergence and annealing

L Chizat - arXiv preprint arXiv:2202.01009, 2022 - arxiv.org
Noisy particle gradient descent (NPGD) is an algorithm to minimize convex functions over
the space of measures that include an entropy term. In the many-particle limit, this algorithm …
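As a rough illustration of the algorithm named in this snippet, the sketch below runs noisy particle gradient descent on a placeholder quadratic potential; the potential, step size, temperature, and particle count are assumptions chosen only to keep the example self-contained, whereas in the paper the drift comes from the first variation of a convex functional of the particle measure.

# Minimal sketch of noisy particle gradient descent (NPGD): m particles take a gradient
# step on a placeholder potential plus Gaussian noise, so their empirical measure
# approximates the mean-field Langevin dynamics for an entropy-regularized objective.
import numpy as np

rng = np.random.default_rng(0)
m, d = 1000, 2                     # number of particles, parameter dimension (assumed)
theta = rng.standard_normal((m, d))

lam = 0.1                          # entropy (temperature) parameter (assumed)
eta = 0.01                         # step size (assumed)
steps = 500

def grad_V(theta):
    # Gradient of a placeholder potential V(theta) = |theta|^2 / 2.
    # In the paper's setting this role is played by the first variation of the
    # convex objective, evaluated at the current particle measure.
    return theta

for _ in range(steps):
    noise = rng.standard_normal(theta.shape)
    theta = theta - eta * grad_V(theta) + np.sqrt(2.0 * eta * lam) * noise

# The particles now approximate the Gibbs measure proportional to exp(-V / lam).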

Gradient descent on infinitely wide neural networks: Global convergence and generalization

F Bach, L Chizat - arXiv preprint arXiv:2110.08084, 2021 - arxiv.org
Many supervised machine learning methods are naturally cast as optimization problems. For
prediction models which are linear in their parameters, this often leads to convex problems …

Convex analysis of the mean field Langevin dynamics

A Nitanda, D Wu, T Suzuki - International Conference on …, 2022 - proceedings.mlr.press
As an example of the nonlinear Fokker-Planck equation, the mean field Langevin dynamics
has recently attracted attention due to its connection to (noisy) gradient descent on infinitely wide …

Sparse optimization on measures with over-parameterized gradient descent

L Chizat - Mathematical Programming, 2022 - Springer
Minimizing a convex function of a measure with a sparsity-inducing penalty is a typical
problem arising, e.g., in sparse spikes deconvolution or two-layer neural network training …

Feature learning via mean-field Langevin dynamics: classifying sparse parities and beyond

T Suzuki, D Wu, K Oko… - Advances in Neural …, 2023 - proceedings.neurips.cc
Neural networks in the mean-field regime are known to be capable of feature learning,
unlike the kernel (NTK) counterpart. Recent works have shown that mean-field neural …