Emerging properties in self-supervised vision transformers
In this paper, we question whether self-supervised learning provides new properties to Vision
Transformers (ViT) that stand out compared to convolutional networks (convnets). Beyond the …
Robust fine-tuning of zero-shot models
Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of
data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific …
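The snippet cuts off before the method, but this paper's well-known recipe (WiSE-FT) is weight-space ensembling: linearly interpolating the zero-shot and fine-tuned checkpoints. A minimal PyTorch sketch, assuming the two state dicts share keys and hold floating-point tensors (the function name and the `alpha` default are illustrative, not from the paper's code):

```python
import torch

def wise_ft(zero_shot_sd, fine_tuned_sd, alpha=0.5):
    """Interpolate two checkpoints in weight space (WiSE-FT style).

    alpha=0 recovers the zero-shot model, alpha=1 the fine-tuned one;
    intermediate values trade fine-tuned accuracy against robustness
    under distribution shift. Assumes matching keys and float tensors.
    """
    return {
        key: (1.0 - alpha) * zero_shot_sd[key] + alpha * fine_tuned_sd[key]
        for key in zero_shot_sd
    }

# usage sketch:
# model.load_state_dict(wise_ft(zs_model.state_dict(), ft_model.state_dict()))
```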
Robust design optimization and emerging technologies for electrical machines: Challenges and open problems
Bio-inspired algorithms are novel, modern, and efficient tools for the design of electrical
machines. However, from a mathematical point of view, these problems belong to the most …
Stochastic actor-oriented models for network dynamics
TAB Snijders - Annual Review of Statistics and Its Application, 2017 - annualreviews.org
This article discusses the stochastic actor-oriented model for analyzing panel data of
networks. The model is defined as a continuous-time Markov chain, observed at two or more …
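For readers new to the model, the formulation behind this snippet can be sketched as follows (notation taken from the broader SAOM literature, not from the truncated abstract): an actor i who gets a change opportunity in the continuous-time chain picks a tie change by a multinomial logit over an objective function that is linear in network statistics:

```latex
% Hedged sketch of the standard SAOM choice probability: x(i ~> j)
% denotes the network after actor i toggles the tie to actor j, and
% s_{ik} are statistics weighted by parameters \beta_k.
P\{i \text{ changes tie to } j\}
  = \frac{\exp\big(f_i(\beta, x(i \leadsto j))\big)}
         {\sum_h \exp\big(f_i(\beta, x(i \leadsto h))\big)},
\qquad
f_i(\beta, x) = \sum_k \beta_k \, s_{ik}(x).
```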
Optimization methods for large-scale machine learning
This paper provides a review and commentary on the past, present, and future of numerical
optimization algorithms in the context of machine learning applications. Through case …
Averaging weights leads to wider optima and better generalization
Deep neural networks are typically trained by optimizing a loss function with an SGD variant,
in conjunction with a decaying learning rate, until convergence. We show that simple …
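The abstract is truncated, but the titular idea (Stochastic Weight Averaging) is easy to state: keep a running average of the weights SGD visits late in training and evaluate the averaged model. A minimal sketch, with illustrative hyperparameters (`swa_start`, the learning rate) not taken from the paper:

```python
import copy
import torch

def train_with_swa(model, loader, loss_fn, epochs=100, swa_start=75, lr=0.05):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    swa_model, n_avg = copy.deepcopy(model), 0
    for epoch in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        if epoch >= swa_start:
            # running average of the weights along the SGD trajectory
            n_avg += 1
            for p_avg, p in zip(swa_model.parameters(), model.parameters()):
                p_avg.data += (p.data - p_avg.data) / n_avg
    return swa_model
```

PyTorch exposes the same pattern as torch.optim.swa_utils.AveragedModel; in either form, batch-norm statistics must be recomputed for the averaged weights before evaluation.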
Analyzing and improving the training dynamics of diffusion models
Diffusion models currently dominate the field of data-driven image synthesis with their
unparalleled scaling to large datasets. In this paper we identify and rectify several causes for …
A simple baseline for bayesian uncertainty in deep learning
We propose SWA-Gaussian (SWAG), a simple, scalable, and general-purpose
approach for uncertainty representation and calibration in deep learning. Stochastic Weight …
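Building on the truncated snippet: SWAG fits a Gaussian to the SGD iterates, centered at the SWA mean, with a covariance estimated from the same trajectory. A diagonal-covariance-only sketch (the paper additionally uses a low-rank term; the class and method names are illustrative):

```python
import torch

class DiagonalSWAG:
    """Track first and second moments of the flattened weights along SGD,
    then sample approximate-posterior weights from N(mean, diag variance)."""

    def __init__(self, model):
        w = torch.nn.utils.parameters_to_vector(model.parameters()).detach()
        self.n = 0
        self.mean = torch.zeros_like(w)
        self.sq_mean = torch.zeros_like(w)

    def collect(self, model):
        # call periodically late in training, e.g. once per epoch
        w = torch.nn.utils.parameters_to_vector(model.parameters()).detach()
        self.n += 1
        self.mean += (w - self.mean) / self.n
        self.sq_mean += (w * w - self.sq_mean) / self.n

    def sample(self):
        var = (self.sq_mean - self.mean ** 2).clamp(min=1e-30)
        return self.mean + var.sqrt() * torch.randn_like(self.mean)
```

At test time one would draw several weight samples, load each into the model with torch.nn.utils.vector_to_parameters, and average the resulting predictions.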
Lookahead optimizer: k steps forward, 1 step back
The vast majority of successful deep neural networks are trained using variants of stochastic
gradient descent (SGD) algorithms. Recent attempts to improve SGD can be broadly …
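The title summarizes the algorithm: an inner "fast" optimizer takes k steps, after which the "slow" weights move a fraction alpha toward the fast weights and the fast weights restart from there. A minimal wrapper sketch (class name illustrative; k=5 and alpha=0.5 are the commonly cited defaults):

```python
import torch

class Lookahead:
    """Wrap an inner optimizer: after every k fast steps, pull the slow
    weights a fraction alpha toward the fast weights, then reset."""

    def __init__(self, optimizer, k=5, alpha=0.5):
        self.opt, self.k, self.alpha, self.step_count = optimizer, k, alpha, 0
        self.slow = [
            [p.detach().clone() for p in group["params"]]
            for group in optimizer.param_groups
        ]

    def zero_grad(self):
        self.opt.zero_grad()

    def step(self):
        self.opt.step()  # one fast step (e.g. SGD or Adam)
        self.step_count += 1
        if self.step_count % self.k == 0:
            for group, slow_group in zip(self.opt.param_groups, self.slow):
                for p, q in zip(group["params"], slow_group):
                    q += self.alpha * (p.detach() - q)  # slow update
                    p.data.copy_(q)                     # restart fast weights
```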
Sparsified SGD with memory
Huge-scale machine learning problems are nowadays tackled by distributed optimization
algorithms, i.e., algorithms that leverage the compute power of many devices for training. The …
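The truncated abstract sets up the distributed context; the mechanism the title refers to is top-k gradient sparsification with an error-feedback memory: coordinates that are not transmitted are kept locally and added back into the next gradient. A single-worker sketch (function name and the 1% sparsity in the usage comment are illustrative):

```python
import torch

def topk_with_memory(grad, memory, k):
    """Top-k gradient sparsification with error feedback: components that
    are not transmitted accumulate in `memory` and re-enter next step."""
    corrected = grad + memory
    idx = corrected.abs().flatten().topk(k).indices
    sparse = torch.zeros_like(corrected).flatten()
    sparse[idx] = corrected.flatten()[idx]
    sparse = sparse.view_as(corrected)
    memory.copy_(corrected - sparse)  # keep the residual for later steps
    return sparse

# per-parameter usage sketch on one worker:
# mem = torch.zeros_like(p.grad)
# p.grad = topk_with_memory(p.grad, mem, k=max(1, p.grad.numel() // 100))
```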