On the implicit bias in deep-learning algorithms
G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly successful in recent years and has led to dramatic improvements in multiple domains …
Trained transformers learn linear models in-context
Attention-based neural networks such as transformers have demonstrated a remarkable
ability to exhibit in-context learning (ICL): Given a short prompt sequence of tokens from an …
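
The snippet breaks off mid-setup; the setting the title refers to is in-context learning of linear models: the prompt carries (x, y) pairs from an unseen linear task, and the model must label a fresh query without any weight update. A minimal sketch of that protocol, with ordinary least squares standing in for the predictor that, per the title, trained transformers learn to emulate (dimensions and prompt length are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 20                       # feature dim and prompt length (illustrative)

w_star = rng.normal(size=d)        # the unseen linear task
X = rng.normal(size=(n, d))        # in-context (x, y) examples
y = X @ w_star
x_query = rng.normal(size=d)

# Least-squares fit to the prompt: the in-context predictor that a trained
# transformer's forward pass can emulate.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(x_query @ w_hat, x_query @ w_star)   # prediction vs. ground truth
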
Surgical fine-tuning improves adaptation to distribution shifts
A common approach to transfer learning under distribution shift is to fine-tune the last few
layers of a pre-trained model, preserving learned features while also adapting to the new …
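
A minimal PyTorch sketch of the recipe the snippet describes, freezing everything except one chosen block; the model and layer here are placeholders, and the "surgical" part of the title refers to choosing which block to unfreeze rather than always taking the last layers:

import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")

for p in model.parameters():          # freeze everything ...
    p.requires_grad = False
for p in model.fc.parameters():       # ... then unfreeze one chosen block;
    p.requires_grad = True            # an earlier block may suit other shifts

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
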
Fine-tuning can distort pretrained features and underperform out-of-distribution
When transferring a pretrained model to a downstream task, two popular methods are full
fine-tuning (updating all the model parameters) and linear probing (updating only the last …
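
The two methods in the snippet differ only in which parameters receive gradients. A sketch contrasting them on a generic backbone-plus-head model (the architecture is a placeholder, not the paper's setup):

import torch.nn as nn

backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU())  # stands in for the
head = nn.Linear(64, 10)                                 # pretrained model

def configure(mode):
    # "linear_probe": freeze the backbone, train only the head.
    # "full_finetune": update everything, which can distort the pretrained
    # features, as the title warns.
    for p in backbone.parameters():
        p.requires_grad = (mode == "full_finetune")
    for p in head.parameters():
        p.requires_grad = True

configure("linear_probe")
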
Adan: Adaptive Nesterov momentum algorithm for faster optimizing deep models
In deep learning, different kinds of deep networks typically need different optimizers, which
have to be chosen after multiple trials, making the training process inefficient. To relieve this …
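
The snippet stops at the motivation. For orientation, here is the classical Nesterov momentum update that the title's "adaptive Nesterov momentum" builds on; this is the textbook method, not Adan's actual update, which additionally adapts step sizes using gradient-difference statistics:

import numpy as np

def nesterov_step(theta, velocity, grad_fn, lr=0.1, mu=0.9):
    # Classical Nesterov momentum: evaluate the gradient at the
    # look-ahead point theta + mu * velocity, then update.
    lookahead_grad = grad_fn(theta + mu * velocity)
    velocity = mu * velocity - lr * lookahead_grad
    return theta + velocity, velocity

# Example: minimize f(x) = ||x||^2 / 2, whose gradient is x.
theta, v = np.ones(3), np.zeros(3)
for _ in range(50):
    theta, v = nesterov_step(theta, v, grad_fn=lambda x: x)
print(theta)   # close to the minimizer at 0
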
Pruning neural networks without any data by iteratively conserving synaptic flow
Pruning the parameters of deep neural networks has generated intense interest due to
potential savings in time, memory and energy both during training and at test time. Recent …
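
The method the title names scores weights with no data at all. A hedged sketch of a synaptic-flow-style score on a toy network: push an all-ones input through the network with absolute-valued weights, and score each weight by |(dR/dw) * w|. This shows a single round; the paper prunes iteratively, re-scoring after each round:

import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 1))

# Temporarily replace weights by their absolute values.
params = [p for p in net.parameters() if p.dim() > 1]
originals = [p.data.clone() for p in params]
for p in params:
    p.data = p.data.abs()

# Data-free "flow": all-ones input, scalar output R, gradients of R.
R = net(torch.ones(1, 8)).sum()
grads = torch.autograd.grad(R, params)
scores = [(g * p).abs() for g, p in zip(grads, params)]

for p, w in zip(params, originals):   # restore the signed weights
    p.data = w
# Low-score weights would be pruned; repeating score-and-prune over many
# rounds is what "iteratively conserving" refers to.
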
Understanding self-supervised learning dynamics without contrastive pairs
While contrastive approaches of self-supervised learning (SSL) learn representations by
minimizing the distance between two augmented views of the same data point (positive …
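
The snippet describes the contrastive baseline; the dynamics the paper studies arise in methods that drop negative pairs and instead combine a predictor head with a stop-gradient, in the style of BYOL and SimSiam. A minimal sketch of that loss (encoder and predictor are placeholders):

import torch
import torch.nn.functional as F

encoder = torch.nn.Linear(32, 16)     # placeholder networks
predictor = torch.nn.Linear(16, 16)

def non_contrastive_loss(view1, view2):
    # Pull the predicted embedding of one view toward a stop-gradient
    # target from the other view; no negative pairs are used.
    z1, z2 = encoder(view1), encoder(view2)
    p1 = predictor(z1)
    return -F.cosine_similarity(p1, z2.detach(), dim=-1).mean()

x = torch.randn(4, 32)
loss = non_contrastive_loss(x + 0.1 * torch.randn_like(x),
                            x + 0.1 * torch.randn_like(x))
loss.backward()
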
On exact computation with an infinitely wide neural net
How well does a classic deep net architecture like AlexNet or VGG19 classify on a standard
dataset such as CIFAR-10 when its “width”—namely, number of channels in convolutional …
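
In the infinite-width limit the question has an exact answer: the trained network behaves like kernel regression under the Neural Tangent Kernel (NTK), which has a closed form. A sketch for a one-hidden-layer ReLU network on unit-norm inputs, using the standard arc-cosine expressions (normalization conventions vary across papers):

import numpy as np

def ntk_two_layer_relu(X1, X2):
    # NTK of a one-hidden-layer ReLU net for unit-norm inputs,
    # up to normalization conventions.
    u = np.clip(X1 @ X2.T, -1.0, 1.0)          # cosines of input angles
    phi = np.arccos(u)
    k0 = (np.pi - phi) / (2 * np.pi)           # from the ReLU derivative
    k1 = (np.sin(phi) + (np.pi - phi) * u) / (2 * np.pi)
    return u * k0 + k1

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.normal(size=50)
x_test = X[:1]                                 # reuse a point for the demo

# Infinite-width prediction = kernel regression with the NTK.
K = ntk_two_layer_relu(X, X)
pred = ntk_two_layer_relu(x_test, X) @ np.linalg.solve(K + 1e-8 * np.eye(50), y)
print(pred)
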
The modern mathematics of deep learning
We describe the new field of the mathematical analysis of deep learning. This field emerged
around a list of research questions that were not answered within the classical framework of …
Implicit regularization in deep matrix factorization
Efforts to understand the generalization mystery in deep learning have led to the belief that
gradient-based optimization induces a form of implicit regularization, a bias towards models …
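
That bias can be seen directly in deep matrix factorization: write the matrix as a product of several factors, run gradient descent on an underdetermined reconstruction loss, and the product drifts toward low rank even though the loss never asks for it. A small numpy sketch of that experiment (sizes, depth, initialization scale, and step size are illustrative):

import numpy as np

rng = np.random.default_rng(0)
n, depth, lr = 10, 3, 0.05

a, b = rng.normal(size=(n, 1)), rng.normal(size=(1, n))
target = (a / np.linalg.norm(a)) @ (b / np.linalg.norm(b))  # rank-1 target
mask = rng.random((n, n)) < 0.5                             # observe half

def prod(mats):
    out = np.eye(n)
    for M in mats:
        out = out @ M
    return out

Ws = [0.5 * np.eye(n) for _ in range(depth)]   # small, balanced init
for _ in range(5000):
    G = mask * (prod(Ws) - target)    # gradient of the observed-entry loss
    grads = [prod(Ws[:i]).T @ G @ prod(Ws[i + 1:]).T for i in range(depth)]
    for i in range(depth):
        Ws[i] -= lr * grads[i]

print(np.linalg.svd(prod(Ws), compute_uv=False).round(3))
# The fitted product tends toward low rank even though the loss never
# demands it: the implicit regularization the paper analyzes.
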