Google Scholar

X Zhang, R Xu, H Yu, H Zou… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Recently, flat minima are proven to be effective for improving generalization and
sharpness-aware minimization (SAM) achieves state-of-the-art performance. Yet the current definition of …

Cited by 52

FasterViT: Fast vision transformers with hierarchical attention

A Hatamizadeh, G Heinrich, H Yin, A Tao… - ar** efficient training techniques to make …

Cited by 62

A modern look at the relationship between sharpness and generalization

M Andriushchenko, F Croce, M Müller, M Hein… - arXiv preprint arXiv …, 2023 - arxiv.org

Sharpness of minima is a promising quantity that can correlate with generalization in deep
networks and, when optimized during training, can improve generalization. However …

Cited by 64

Visual Mamba: A survey and new outlooks

R Xu, S Yang, Y Wang, Y Cai, B Du, H Chen - arXiv preprint arXiv …, 2024 - arxiv.org

Mamba, a recent selective structured state space model, excels in long sequence modeling,
which is vital in the large model era. Long sequence modeling poses significant challenges …

Cited by 8

Friendly sharpness-aware minimization

T Li, P Zhou, Z He, X Cheng… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Sharpness-Aware Minimization (SAM) has been instrumental in improving deep
neural network training by minimizing both training loss and loss sharpness. Despite the …

Cited by 14

FlatMatch: Bridging labeled data and unlabeled data with cross-sharpness for semi-supervised learning

Z Huang, L Shen, J Yu, B Han… - Advances in Neural …, 2023 - proceedings.neurips.cc

Semi-Supervised Learning (SSL) has been an effective way to leverage abundant
unlabeled data with extremely scarce labeled data. However, most SSL methods are …

Cited by 27

Flatness-aware minimization for domain generalization

X Zhang, R Xu, H Yu, Y Dong… - Proceedings of the …, 2023 - openaccess.thecvf.com

Domain generalization (DG) seeks to learn robust models that generalize well
under unknown distribution shifts. As a critical aspect of DG, optimizer selection has not …

Cited by 24

Compute-efficient deep learning: Algorithmic trends and opportunities

BR Bartoldson, B Kailkhura, D Blalock - Journal of Machine Learning …, 2023 - jmlr.org

Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …

Cited by 50

Sharpness-aware training for free

Gradient norm aware minimization seeks first-order flatness and improves generalization