Neural collapse: A review on modelling principles and generalization
V Kothapalli - arXiv preprint arXiv:2206.04041, 2022 - arxiv.org
Deep classifier neural networks enter the terminal phase of training (TPT) when training
error reaches zero and tend to exhibit intriguing Neural Collapse (NC) properties. Neural …
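As background for the entry above: one of the properties usually grouped under neural collapse is within-class variability collapse (often labelled NC1), measured on last-layer features as tr(Σ_W · Σ_B⁺) / K. A minimal sketch of that measurement is below; the metric formulation is the commonly used one, and the features and labels are synthetic placeholders rather than outputs of a trained network.

```python
import numpy as np

def nc1_metric(features: np.ndarray, labels: np.ndarray) -> float:
    """Within-class variability collapse: tr(Sigma_W @ pinv(Sigma_B)) / K.

    features: (N, d) last-layer activations; labels: (N,) integer classes.
    Values near zero indicate that per-class features have collapsed onto
    their class means.
    """
    classes = np.unique(labels)
    K, d = len(classes), features.shape[1]
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in classes:
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        centered = fc - mu_c
        sigma_w += centered.T @ centered / len(fc)
        diff = (mu_c - global_mean)[:, None]
        sigma_b += diff @ diff.T
    sigma_w /= K
    sigma_b /= K
    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / K)

# Synthetic placeholder features: tight per-class clusters give a small NC1 value.
rng = np.random.default_rng(0)
means = rng.normal(size=(3, 8))
feats = np.concatenate([means[c] + 0.01 * rng.normal(size=(50, 8)) for c in range(3)])
labs = np.repeat(np.arange(3), 50)
print(nc1_metric(feats, labs))
```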
Optimization for deep learning: An overview
RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research due to various reasons. First, its …
TIES-Merging: Resolving interference when merging models
Transfer learning, i.e., further fine-tuning a pre-trained model on a downstream task, can
confer significant advantages, including improved downstream performance, faster …
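Since the snippet above covers the motivation for merging rather than the merging procedure itself, here is only a minimal sketch of the naive baseline it builds on: averaging task vectors (fine-tuned weights minus pre-trained weights) and adding the average back to the pre-trained model. Per-parameter sign disagreements between task vectors partially cancel under this average, which is the kind of interference the title refers to. All names, shapes, and values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def naive_task_vector_merge(pretrained: np.ndarray, finetuned: list[np.ndarray]) -> np.ndarray:
    """Average the task vectors (fine-tuned minus pre-trained) and add them back.

    When two task vectors disagree in sign on the same parameter, the mean
    partially cancels them; more careful merging schemes try to resolve
    exactly that interference.
    """
    task_vectors = [w - pretrained for w in finetuned]
    return pretrained + np.mean(task_vectors, axis=0)

# Stand-ins for a flat pre-trained weight vector and three fine-tuned variants.
rng = np.random.default_rng(1)
base = rng.normal(size=10)
models = [base + rng.normal(scale=0.1, size=10) for _ in range(3)]
merged = naive_task_vector_merge(base, models)
print(merged.shape)
```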
The role of permutation invariance in linear mode connectivity of neural networks
In this paper, we conjecture that if the permutation invariance of neural networks is taken into
account, SGD solutions will likely have no barrier in the linear interpolation between them …
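The conjecture above rests on the fact that permuting a network's hidden units, together with the matching rows and columns of the adjacent weight matrices, leaves its function unchanged, so two SGD solutions may only look far apart until one of them is re-permuted. A minimal sketch of that invariance for a one-hidden-layer ReLU network, with arbitrary example sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(16, 4)), rng.normal(size=16)    # hidden layer (16 units, 4 inputs)
W2, b2 = rng.normal(size=(3, 16)), rng.normal(size=3)     # output layer (3 outputs)

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0) + b2            # ReLU MLP

perm = rng.permutation(16)                                 # reorder the hidden units
x = rng.normal(size=4)
y_original = mlp(x, W1, b1, W2, b2)
y_permuted = mlp(x, W1[perm], b1[perm], W2[:, perm], b2)   # permute rows of W1, columns of W2
print(np.allclose(y_original, y_permuted))                 # True: the function is unchanged
```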
S4L: Self-supervised semi-supervised learning
This work tackles the problem of semi-supervised learning of image classifiers. Our main
insight is that the field of semi-supervised learning can benefit from the quickly advancing …
Linear mode connectivity and the lottery ticket hypothesis
We study whether a neural network optimizes to the same, linearly connected minimum
under different samples of SGD noise (e.g., random data order and augmentation). We find …
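Linear mode connectivity is typically quantified by the loss barrier along the straight line between two solutions: the largest amount by which the loss on the path exceeds the linear interpolation of the endpoint losses. A minimal sketch of that measurement is below; the logistic-regression loss and random data are toy stand-ins for a real network and dataset (for this convex toy loss the barrier is zero by construction, but the same routine applies to network weights).

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)

def loss(theta: np.ndarray) -> float:
    """Toy stand-in for a network's training loss: logistic regression on (X, y)."""
    p = 1.0 / (1.0 + np.exp(-(X @ theta)))
    return float(-np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12)))

def loss_barrier(theta_a: np.ndarray, theta_b: np.ndarray, steps: int = 21) -> float:
    """Largest gap between the loss on the linear path and the interpolated endpoint losses."""
    alphas = np.linspace(0.0, 1.0, steps)
    path = [loss((1 - a) * theta_a + a * theta_b) for a in alphas]
    endpoints = [(1 - a) * path[0] + a * path[-1] for a in alphas]
    return max(p - e for p, e in zip(path, endpoints))

theta_a, theta_b = rng.normal(size=5), rng.normal(size=5)  # stand-ins for two trained solutions
print(loss_barrier(theta_a, theta_b))
```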
Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks
Recent works have cast some light on the mystery of why deep nets fit any data and
generalize despite being very overparametrized. This paper analyzes training and …
The modern mathematics of deep learning
We describe the new field of the mathematical analysis of deep learning. This field emerged
around a list of research questions that were not answered within the classical framework of …
Gradient descent finds global minima of deep neural networks
Gradient descent finds a global minimum in training deep neural networks despite the
objective function being non-convex. The current paper proves gradient descent achieves …
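As a toy illustration of the phenomenon (not of the paper's proof technique), plain gradient descent on a heavily over-parameterized two-layer ReLU network usually drives the training loss on a small random dataset close to zero. A minimal numpy sketch, with the width, learning rate, and step count chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, m = 20, 5, 512                              # 20 samples, width-512 hidden layer
X = rng.normal(size=(n, d))
y = rng.normal(size=n)                            # arbitrary real-valued targets

W = rng.normal(size=(m, d)) / np.sqrt(d)          # trainable hidden-layer weights
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # fixed output weights

lr = 0.5
for _ in range(10_000):
    h = X @ W.T                                   # pre-activations, shape (n, m)
    r = np.maximum(h, 0) @ a - y                  # residuals of f(X) = relu(X W^T) a
    # Gradient of 0.5 * mean squared error with respect to W only.
    grad = ((r[:, None] * (h > 0) * a).T @ X) / n
    W -= lr * grad

final_loss = 0.5 * np.mean((np.maximum(X @ W.T, 0) @ a - y) ** 2)
print(final_loss)                                 # typically very small at this width
```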
ZipIt! Merging models from different tasks without training
Typical deep visual recognition models are capable of performing the one task they were
trained on. In this paper, we tackle the extremely difficult problem of combining completely …