Gradient norm aware minimization seeks first-order flatness and improves generalization

X Zhang, R Xu, H Yu, H Zou… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recently, flat minima are proven to be effective for improving generalization and
sharpness-aware minimization (SAM) achieves state-of-the-art performance. Yet the current definition of …

A modern look at the relationship between sharpness and generalization

M Andriushchenko, F Croce, M Müller, M Hein… - arXiv preprint arXiv …, 2023 - arxiv.org
Sharpness of minima is a promising quantity that can correlate with generalization in deep
networks and, when optimized during training, can improve generalization. However …

Visual Mamba: A survey and new outlooks

R Xu, S Yang, Y Wang, Y Cai, B Du, H Chen - arXiv preprint arXiv …, 2024 - arxiv.org
Mamba, a recent selective structured state space model, excels in long sequence modeling,
which is vital in the large model era. Long sequence modeling poses significant challenges …

Friendly sharpness-aware minimization

T Li, P Zhou, Z He, X Cheng… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Sharpness-Aware Minimization (SAM) has been instrumental in improving deep
neural network training by minimizing both training loss and loss sharpness. Despite the …

FlatMatch: Bridging labeled data and unlabeled data with cross-sharpness for semi-supervised learning

Z Huang, L Shen, J Yu, B Han… - Advances in Neural …, 2023 - proceedings.neurips.cc
Semi-Supervised Learning (SSL) has been an effective way to leverage abundant
unlabeled data with extremely scarce labeled data. However, most SSL methods are …

Flatness-aware minimization for domain generalization

X Zhang, R Xu, H Yu, Y Dong… - Proceedings of the …, 2023 - openaccess.thecvf.com
Domain generalization (DG) seeks to learn robust models that generalize well
under unknown distribution shifts. As a critical aspect of DG, optimizer selection has not …

Compute-efficient deep learning: Algorithmic trends and opportunities

BR Bartoldson, B Kailkhura, D Blalock - Journal of Machine Learning …, 2023 - jmlr.org
Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …