Optimization for deep learning: An overview
RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research for several reasons. First, its …
The global landscape of neural networks: An overview
One of the major concerns for neural network training is that the nonconvexity of the
associated loss functions may cause a bad landscape. The recent success of neural …
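To make the "landscape" notion concrete, here is a minimal sketch (our own construction, not the paper's) that probes the loss of a small ReLU network along a random direction in parameter space; the model, data, and scaling are purely illustrative:

    import torch
    import torch.nn as nn

    # Toy setup: a small ReLU net and random data (illustrative only).
    torch.manual_seed(0)
    X, y = torch.randn(128, 10), torch.randn(128, 1)
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    loss_fn = nn.MSELoss()

    # Flatten the parameters and pick a random direction of the same size.
    theta = torch.cat([p.detach().flatten() for p in model.parameters()])
    direction = torch.randn_like(theta)
    direction *= theta.norm() / direction.norm()  # crude scale normalization

    def set_params(vec):
        i = 0
        for p in model.parameters():
            n = p.numel()
            p.data = vec[i:i + n].view_as(p)
            i += n

    # Evaluate the loss along the 1-D slice theta + t * direction.
    for t in torch.linspace(-1.0, 1.0, 11):
        set_params(theta + t * direction)
        with torch.no_grad():
            print(f"t={t:+.1f}  loss={loss_fn(model(X), y).item():.4f}")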
Optimization for deep learning: theory and algorithms
R Sun - arXiv preprint arXiv:1912.08957, 2019 - arxiv.org
When and why can a neural network be successfully trained? This article provides an
overview of optimization algorithms and theory for training neural networks. First, we discuss …
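For reference, the basic object such surveys analyze is the gradient step theta <- theta - eta * grad L(theta). A minimal gradient-descent sketch on a least-squares toy problem (the problem sizes and step size are our choices):

    import numpy as np

    # Minimize f(w) = 0.5 * ||A w - b||^2 with plain gradient descent.
    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
    w = np.zeros(5)
    lr = 0.01  # step size; must be < 2 / lambda_max(A^T A) for convergence
    for step in range(500):
        grad = A.T @ (A @ w - b)  # gradient of the least-squares loss
        w -= lr * grad
    print("final loss:", 0.5 * np.linalg.norm(A @ w - b) ** 2)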
Mechanistic mode connectivity
We study neural network loss landscapes through the lens of mode connectivity, the
observation that minimizers of neural networks retrieved via training on a dataset are …
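Mode connectivity is usually probed by evaluating the loss along a path between two independently trained minimizers; the simplest version is the straight line. A sketch (the toy model and data are ours):

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    X, y = torch.randn(256, 10), torch.randint(0, 2, (256,))

    def train(seed):
        torch.manual_seed(seed)
        net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
        opt = torch.optim.SGD(net.parameters(), lr=0.1)
        for _ in range(200):
            opt.zero_grad()
            nn.functional.cross_entropy(net(X), y).backward()
            opt.step()
        return net

    net_a, net_b = train(1), train(2)

    # Loss along the straight line (1 - t) * theta_a + t * theta_b.
    net_t = copy.deepcopy(net_a)
    for t in torch.linspace(0, 1, 11):
        for p_t, p_a, p_b in zip(net_t.parameters(),
                                 net_a.parameters(), net_b.parameters()):
            p_t.data = (1 - t) * p_a.data + t * p_b.data
        with torch.no_grad():
            loss = nn.functional.cross_entropy(net_t(X), y).item()
        print(f"t={t:.1f}  loss={loss:.4f}")  # a bump in the middle = barrier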
What Happens after SGD Reaches Zero Loss?--A Mathematical Framework
Understanding the implicit bias of Stochastic Gradient Descent (SGD) is one of the key
challenges in deep learning, especially for overparametrized models, where the local …
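A toy illustration in the spirit of this framework (our own construction, not from the paper): SGD with label noise on the overparametrized factorization w1 * w2 ≈ y keeps moving along the zero-loss set {w1 * w2 = 1} after fitting, and the framework predicts a slow drift toward flatter, balanced points:

    import numpy as np

    rng = np.random.default_rng(0)
    # Toy overparametrized model: predict y with w1 * w2. Every point on the
    # curve {w1 * w2 = 1} fits the clean label y = 1 with exactly zero loss.
    w1, w2 = 2.0, 0.5          # start at an unbalanced zero-loss point
    lr, sigma = 0.05, 0.5      # step size and label-noise scale (our choices)
    for step in range(1, 50001):
        y = 1.0 + sigma * rng.standard_normal()        # label noise
        r = w1 * w2 - y                                # residual
        w1, w2 = w1 - lr * r * w2, w2 - lr * r * w1    # SGD step on 0.5 * r**2
        if step % 10000 == 0:
            print(f"w1={w1:+.3f}  w2={w2:+.3f}  w1*w2={w1 * w2:.3f}")
    # The product stays pinned near 1 (zero loss), but the iterate keeps moving
    # along the manifold; the paper's framework predicts a drift toward the
    # flatter, balanced region where |w1| = |w2| = 1.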
Geometry of the loss landscape in overparameterized neural networks: Symmetries and invariances
We study how permutation symmetries in overparameterized multi-layer neural networks
generate 'symmetry-induced' critical points. Assuming a network with $L$ layers of minimal …
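The underlying symmetry is easy to verify directly: permuting the hidden units of one layer, together with the matching columns of the next layer's weights, leaves the network function unchanged. A sketch:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
    x = torch.randn(8, 10)

    # Permute the hidden units: rows (and biases) of the first layer,
    # matching columns of the second layer.
    perm = torch.randperm(16)
    with torch.no_grad():
        out_before = net(x).clone()
        net[0].weight.data = net[0].weight.data[perm]
        net[0].bias.data = net[0].bias.data[perm]
        net[2].weight.data = net[2].weight.data[:, perm]
        out_after = net(x)
    print(torch.allclose(out_before, out_after, atol=1e-6))  # True: same function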
Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks
A recent line of work has analyzed the theoretical properties of deep neural networks via the
Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue of the NTK has been …
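The empirical (finite-width) NTK Gram matrix and its smallest eigenvalue can be computed directly for a small network; a sketch (architecture and sample size are illustrative):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(5, 64), nn.ReLU(), nn.Linear(64, 1))
    X = torch.randn(16, 5)

    # Empirical NTK Gram matrix: K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)>.
    grads = []
    for i in range(X.shape[0]):
        net.zero_grad()
        net(X[i:i + 1]).sum().backward()
        grads.append(torch.cat([p.grad.flatten() for p in net.parameters()]))
    J = torch.stack(grads)   # (n, num_params) Jacobian of outputs w.r.t. params
    K = J @ J.T              # (n, n) NTK Gram matrix, positive semidefinite
    print("smallest eigenvalue:", torch.linalg.eigvalsh(K)[0].item())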
Going beyond linear mode connectivity: The layerwise linear feature connectivity
Recent work has revealed many intriguing empirical phenomena in neural network training,
despite the poorly understood and highly complex loss landscapes and training dynamics …
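Layerwise linear feature connectivity (LLFC) compares the features of the weight-interpolated network with the interpolation of the two networks' features. The sketch below only demonstrates the measurement, using untrained nets; the paper's claim concerns suitably trained (and typically permutation-aligned) networks, where this similarity is reported to be near 1:

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x = torch.randn(32, 10)

    def make_net(seed):
        torch.manual_seed(seed)
        return nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))

    net_a, net_b = make_net(1), make_net(2)

    def features(net, x):
        return net[1](net[0](x))  # post-ReLU activations of the first layer

    # Interpolate the weights, then compare the midpoint network's features
    # against the midpoint of the two networks' features.
    net_mid = copy.deepcopy(net_a)
    for p_m, p_a, p_b in zip(net_mid.parameters(),
                             net_a.parameters(), net_b.parameters()):
        p_m.data = 0.5 * (p_a.data + p_b.data)
    f_mid = features(net_mid, x)
    f_avg = 0.5 * (features(net_a, x) + features(net_b, x))
    cos = nn.functional.cosine_similarity(f_mid.flatten(), f_avg.flatten(), dim=0)
    print("cosine similarity:", cos.item())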
Learning ReLU networks on linearly separable data: Algorithm, optimality, and generalization
Neural networks with rectified linear unit (ReLU) activation functions (aka ReLU networks)
have achieved great empirical success in various domains. Nonetheless, existing results for …
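A minimal experiment matching the setting in the title (the data and architecture are our choices): train a small ReLU network on linearly separable data and check that it fits the training set:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    # Linearly separable data: labels given by the sign of a fixed linear map.
    X = torch.randn(200, 2)
    w_star = torch.tensor([1.0, -2.0])
    y = (X @ w_star > 0).long()

    net = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 2))
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    for _ in range(500):
        opt.zero_grad()
        nn.functional.cross_entropy(net(X), y).backward()
        opt.step()
    acc = (net(X).argmax(1) == y).float().mean()
    print("train accuracy:", acc.item())  # typically 1.0 on separable data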
DART: Diversify-aggregate-repeat training improves generalization of neural networks
Generalization of neural networks is crucial for deploying them safely in the real
world. Common training strategies to improve generalization involve the use of data …
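A schematic of the diversify-aggregate-repeat loop as we read the title (the paper's actual augmentations, schedules, and aggregation details differ): train several clones on different views of the data, average their weights back into one model, and repeat:

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    X, y = torch.randn(256, 10), torch.randint(0, 2, (256,))

    base = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    n_members, n_rounds, steps = 3, 4, 100
    for _ in range(n_rounds):
        # Diversify: clone the base model and train each copy on a different
        # randomly augmented view of the data (a stand-in for real augmentations).
        members = [copy.deepcopy(base) for _ in range(n_members)]
        for net in members:
            opt = torch.optim.SGD(net.parameters(), lr=0.05)
            for _ in range(steps):
                X_aug = X + 0.1 * torch.randn_like(X)
                opt.zero_grad()
                nn.functional.cross_entropy(net(X_aug), y).backward()
                opt.step()
        # Aggregate: average member weights into the base model, then repeat.
        with torch.no_grad():
            for i, p in enumerate(base.parameters()):
                p.copy_(sum(list(net.parameters())[i] for net in members)
                        / n_members)
    print("loss:", nn.functional.cross_entropy(base(X), y).item())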