On the implicit bias in deep-learning algorithms

G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly successful in recent years and has led to dramatic improvements in multiple domains …
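
To make the survey's topic concrete: the best-known instance of implicit bias (due to Soudry et al. and covered in this survey) is that gradient descent on the unregularized logistic loss over linearly separable data diverges in norm but converges in direction to the maximum-margin separator. A minimal numpy sketch with synthetic data and an illustrative step count; sklearn's SVC is used only to approximate the hard-margin direction:

```python
import numpy as np
from sklearn.svm import SVC  # used only to obtain the max-margin direction

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X @ np.array([1.0, -2.0]))  # linearly separable labels

# Plain gradient descent on the unregularized logistic loss.
w, lr = np.zeros(2), 0.5
for _ in range(100_000):
    m = np.clip(y * (X @ w), -30, 30)   # margins, clipped for stability
    p = 1.0 / (1.0 + np.exp(m))         # sigmoid(-margin)
    w += lr * (y * p) @ X / len(X)      # negative gradient step

# Hard-margin SVM direction (a very large C approximates the hard margin).
u = SVC(kernel="linear", C=1e6).fit(X, y).coef_.ravel()

# ||w|| grows without bound, but the direction aligns with max-margin.
cos = w @ u / (np.linalg.norm(w) * np.linalg.norm(u))
print(f"|w| = {np.linalg.norm(w):.1f}, cosine to max-margin = {cos:.4f}")
```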

Trained transformers learn linear models in-context

R Zhang, S Frei, PL Bartlett - Journal of Machine Learning Research, 2024 - jmlr.org
Attention-based neural networks such as transformers have demonstrated a remarkable
ability to exhibit in-context learning (ICL): Given a short prompt sequence of tokens from an …
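
A well-known construction from this line of work (not necessarily the paper's exact parameterization) is that a single *linear* self-attention head can implement one step of gradient descent on the in-context least-squares objective, so its prediction on a query matches the one-step GD predictor. A hypothetical numpy sketch of that equivalence:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 5, 32                        # feature dim, in-context examples

w_star = rng.normal(size=d)         # this prompt's task vector
X = rng.normal(size=(n, d))         # in-context inputs
y = X @ w_star                      # in-context labels
x_q = rng.normal(size=d)            # query input

# One GD step from w = 0 on the in-context loss (1/2n) sum_i (x_i.w - y_i)^2.
eta = 0.1
w_gd = (eta / n) * (y @ X)
pred_gd = x_q @ w_gd

# Linear attention over tokens e_i = [x_i; y_i] with query token [x_q; 0]:
# unnormalized scores <x_q, x_i> (no softmax), with the y_i as values.
scores = X @ x_q
pred_attn = (eta / n) * (scores @ y)

print(pred_gd, pred_attn)           # identical up to floating point
```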

Surgical fine-tuning improves adaptation to distribution shifts

Y Lee, AS Chen, F Tajwar, A Kumar, H Yao… - arXiv preprint arXiv …, 2022 - arxiv.org
A common approach to transfer learning under distribution shift is to fine-tune the last few
layers of a pre-trained model, preserving learned features while also adapting to the new …
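
The idea is to fine-tune only a contiguous subset of layers chosen to match the shift (the paper reports that input-level shifts favor early layers, while output-level shifts favor later ones). A hedged PyTorch sketch, with torchvision's ResNet-18 (weights argument as in torchvision ≥ 0.13) and the first residual stage as illustrative choices:

```python
import torch
from torchvision.models import resnet18

# Adapt a pretrained ResNet-18 to an input-level shift (e.g., image
# corruptions), the regime where tuning early layers is reported to help.
model = resnet18(weights="IMAGENET1K_V1")

for p in model.parameters():            # freeze everything...
    p.requires_grad = False
for p in model.layer1.parameters():     # ...then unfreeze one "surgical" block
    p.requires_grad = True

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad],
    lr=1e-3, momentum=0.9,
)
# A standard training loop on the shifted target data follows; for an
# output-level shift, one would instead unfreeze a late block or the head.
```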

Fine-tuning can distort pretrained features and underperform out-of-distribution

A Kumar, A Raghunathan, R Jones, T Ma… - arXiv preprint arXiv …, 2022 - arxiv.org
When transferring a pretrained model to a downstream task, two popular methods are full
fine-tuning (updating all the model parameters) and linear probing (updating only the last …
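
The paper's analysis motivates the two-stage LP-FT recipe: first linear-probe a new head on frozen features, then fine-tune everything from that initialization so early gradients from a randomly initialized head do not distort the pretrained features. A sketch under assumed hyperparameters:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 10)    # fresh head for the task

def set_backbone_trainable(trainable: bool):
    for name, p in model.named_parameters():
        if not name.startswith("fc."):
            p.requires_grad = trainable

# Stage 1 (LP): train only the head on frozen pretrained features.
set_backbone_trainable(False)
probe_opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
# ...train the head to convergence...

# Stage 2 (FT): fine-tune all parameters from the probed head with a
# small learning rate, limiting feature distortion early in training.
set_backbone_trainable(True)
ft_opt = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
# ...fine-tune all parameters...
```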

Adan: Adaptive Nesterov momentum algorithm for faster optimizing deep models

X **e, P Zhou, H Li, Z Lin, S Yan - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
In deep learning, different kinds of deep networks typically need different optimizers, which
have to be chosen after multiple trials, making the training process inefficient. To relieve this …
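
A simplified, single-tensor sketch of the Adan update as I read the paper: an EMA of gradients, an EMA of gradient differences (the Nesterov-style correction), a second-moment estimate of the corrected gradient, and decoupled weight decay. The constants are illustrative and the recursions are my reading of the paper; the authors' released implementation should be treated as authoritative:

```python
import numpy as np

def adan_step(theta, grad, prev_grad, m, v, n,
              lr=1e-3, b1=0.02, b2=0.08, b3=0.01, eps=1e-8, wd=0.0):
    """One Adan update (simplified sketch, not the official implementation)."""
    diff = grad - prev_grad
    m = (1 - b1) * m + b1 * grad                            # EMA of gradients
    v = (1 - b2) * v + b2 * diff                            # EMA of grad diffs
    n = (1 - b3) * n + b3 * (grad + (1 - b2) * diff) ** 2   # second moment
    theta = theta - lr * (m + (1 - b2) * v) / (np.sqrt(n) + eps)
    return theta / (1 + lr * wd), m, v, n                   # decoupled decay

# Toy usage: minimize ||theta||^2.
theta = np.array([5.0, -3.0])
m, v, n, prev = (np.zeros(2) for _ in range(4))
for _ in range(3000):
    g = 2 * theta
    theta, m, v, n = adan_step(theta, g, prev, m, v, n, lr=0.05)
    prev = g
print(theta)  # should end up near the minimum at 0
```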

Pruning neural networks without any data by iteratively conserving synaptic flow

H Tanaka, D Kunin, DL Yamins… - Advances in neural …, 2020 - proceedings.neurips.cc
Pruning the parameters of deep neural networks has generated intense interest due to
potential savings in time, memory and energy both during training and at test time. Recent …
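
The method scores each parameter by its "synaptic flow": send an all-ones input through the network with every weight replaced by its absolute value, and take |θ ⊙ ∂(sum of outputs)/∂θ|; pruning is iterative so flow is re-balanced after each removal. A hedged PyTorch sketch on a small MLP (the schedule and step count are illustrative):

```python
import torch
import torch.nn as nn

def synflow_scores(model, input_shape):
    """Data-free saliency |theta * d(sum of outputs)/d(theta)|, computed with
    all parameters replaced by their absolute values and an all-ones input."""
    signs = {k: v.sign() for k, v in model.state_dict().items()}
    for v in model.state_dict().values():
        v.abs_()                                   # linearize the network
    out = model(torch.ones(1, *input_shape)).sum()
    grads = torch.autograd.grad(out, list(model.parameters()))
    scores = [(p * g).abs().detach() for p, g in zip(model.parameters(), grads)]
    for k, v in model.state_dict().items():
        v.mul_(signs[k])                           # restore original signs
    return scores

# Iterative pruning with an exponential sparsity schedule (illustrative).
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
keep, steps = 0.1, 100                             # target density, iterations
for step in range(steps):
    scores = synflow_scores(model, (784,))
    flat = torch.cat([s.flatten() for s in scores])
    k = int(len(flat) * (1 - keep ** ((step + 1) / steps)))
    if k == 0:
        continue
    threshold = flat.kthvalue(k).values
    with torch.no_grad():
        for p, s in zip(model.parameters(), scores):
            p[s <= threshold] = 0.0                # prune the weakest weights

density = sum((p != 0).sum().item() for p in model.parameters()) / len(flat)
print(f"final density: {density:.3f}")
```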

Understanding self-supervised learning dynamics without contrastive pairs

Y Tian, X Chen, S Ganguli - International Conference on …, 2021 - proceedings.mlr.press
While contrastive approaches of self-supervised learning (SSL) learn representations by
minimizing the distance between two augmented views of the same data point (positive …
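
The setting analyzed is BYOL/SimSiam-style training: an online encoder with a predictor is matched to a stop-gradient (and optionally EMA) target on two views of the same input, with no negative pairs; the paper studies why the predictor and stop-gradient prevent representational collapse. A minimal PyTorch sketch, with Gaussian noise standing in for real augmentations and an illustrative architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Online encoder + predictor, EMA target, stop-gradient, no negative pairs.
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
target = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
target.load_state_dict(encoder.state_dict())
predictor = nn.Linear(16, 16)

opt = torch.optim.SGD([*encoder.parameters(), *predictor.parameters()], lr=0.05)
tau = 0.996                                  # EMA rate for the target

for step in range(1000):
    x = torch.randn(128, 32)
    v1 = x + 0.1 * torch.randn_like(x)       # two "augmented" views
    v2 = x + 0.1 * torch.randn_like(x)

    p = predictor(encoder(v1))               # online branch
    with torch.no_grad():                    # stop-gradient on target branch
        z = target(v2)
    loss = -F.cosine_similarity(p, z, dim=-1).mean()

    opt.zero_grad()
    loss.backward()
    opt.step()

    with torch.no_grad():                    # slow EMA update of the target
        for pt, po in zip(target.parameters(), encoder.parameters()):
            pt.mul_(tau).add_((1 - tau) * po)
```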

On exact computation with an infinitely wide neural net

S Arora, SS Du, W Hu, Z Li… - Advances in neural …, 2019 - proceedings.neurips.cc
How well does a classic deep net architecture like AlexNet or VGG19 classify on a standard
dataset such as CIFAR-10 when its “width”—namely, number of channels in convolutional …
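
For the fully connected case the paper builds on, the infinite-width NTK has a closed-form layerwise recursion via the ReLU arc-cosine kernels, and training the infinite net to convergence on squared loss is exact kernel regression with that kernel (the paper's main contribution is the analogous exact convolutional NTK, not sketched here). A numpy sketch of the dense recursion, assuming unit-norm inputs and no biases:

```python
import numpy as np

def relu_ntk(X, depth=3):
    """Exact NTK of an infinitely wide fully-connected ReLU net (no biases);
    rows of X are assumed unit-norm."""
    def k0(rho):  # 2 * E[relu'(u) relu'(v)] for unit-variance (u, v)
        return (np.pi - np.arccos(rho)) / np.pi
    def k1(rho):  # 2 * E[relu(u) relu(v)] for unit-variance (u, v)
        return (np.sqrt(1 - rho**2) + rho * (np.pi - np.arccos(rho))) / np.pi

    sigma = X @ X.T                      # Sigma^(0), the input gram matrix
    theta = sigma.copy()                 # Theta^(0)
    for _ in range(depth):
        d = np.sqrt(np.outer(np.diag(sigma), np.diag(sigma)))
        rho = np.clip(sigma / d, -1.0, 1.0)
        sigma_new = d * k1(rho)          # next NNGP kernel
        theta = sigma_new + theta * k0(rho)
        sigma = sigma_new
    return theta

# Training the infinite-width net to zero squared loss equals kernel
# regression with the NTK; a tiny ridge keeps the solve well conditioned.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = np.sin(3 * X[:, 0])
K = relu_ntk(X)
alpha = np.linalg.solve(K + 1e-8 * np.eye(len(K)), y)
print("train residual:", np.linalg.norm(K @ alpha - y))
```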

The modern mathematics of deep learning

J Berner, P Grohs, G Kutyniok… - arXiv preprint arXiv …, 2021 - cambridge.org
We describe the new field of the mathematical analysis of deep learning. This field emerged
around a list of research questions that were not answered within the classical framework of …

Implicit regularization in deep matrix factorization

S Arora, N Cohen, W Hu, Y Luo - Advances in neural …, 2019 - proceedings.neurips.cc
Efforts to understand the generalization mystery in deep learning have led to the belief that
gradient-based optimization induces a form of implicit regularization, a bias towards models …
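
The paper's setting can be reproduced in a few lines: parameterize a matrix as a product of several factors, run plain gradient descent on observed entries from a small initialization, and the recovered matrix tends toward low effective rank, a bias the paper argues is not fully captured by nuclear-norm minimization. A numpy sketch with illustrative sizes, learning rate, and step count:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rank = 20, 2

# Ground-truth low-rank matrix, observed on a random 40% subset of entries.
W_star = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, n))
mask = rng.random((n, n)) < 0.4
m_obs = mask.sum()

# Depth-3 factorization W = W3 @ W2 @ W1 with small initialization,
# trained by plain gradient descent on the observed squared error.
W1, W2, W3 = (0.1 * rng.normal(size=(n, n)) for _ in range(3))
lr = 0.3
for _ in range(20_000):
    R = mask * (W3 @ W2 @ W1 - W_star)     # residual on observed entries
    g1 = (W3 @ W2).T @ R                   # dL/dW1
    g2 = W3.T @ R @ W1.T                   # dL/dW2
    g3 = R @ (W2 @ W1).T                   # dL/dW3
    W1 -= lr * g1 / m_obs
    W2 -= lr * g2 / m_obs
    W3 -= lr * g3 / m_obs

s = np.linalg.svd(W3 @ W2 @ W1, compute_uv=False)
print("top singular values:", np.round(s[:5], 3))
# Under the implicit low-rank bias, the spectrum should concentrate on
# roughly `rank` dominant values even though nothing enforces low rank.
```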