[BOOK][B] Dive into Deep Learning

A Zhang, ZC Lipton, M Li, AJ Smola - 2023 - books.google.com
Deep learning has revolutionized pattern recognition, introducing tools that power a wide
range of technologies in such diverse fields as computer vision, natural language …

Towards efficient and scalable sharpness-aware minimization

Y Liu, S Mai, X Chen, CJ Hsieh… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Recently, Sharpness-Aware Minimization (SAM), which connects the geometry of
the loss landscape and generalization, has demonstrated a significant performance boost …
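As a rough illustration of the idea this entry names, below is a minimal sketch of one SAM step on a toy quadratic loss, assuming the standard two-pass approximation (ascend within an L2 ball, then descend with the perturbed gradient); the toy loss, rho, and learning rate are illustrative assumptions, not details from this listing.

```python
import numpy as np

def toy_loss(w):
    return 0.5 * np.sum(w ** 2)

def toy_grad(w):
    return w

def sam_step(w, rho=0.05, lr=0.1):
    g = toy_grad(w)
    # Ascend to the approximate worst-case point within an L2 ball of radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed weights.
    g_sharp = toy_grad(w + eps)
    return w - lr * g_sharp

w = np.array([1.0, -2.0])
for _ in range(10):
    w = sam_step(w)
print(w, toy_loss(w))
```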

Deep leakage from gradients

L Zhu, Z Liu, S Han - Advances in neural information …, 2019 - proceedings.neurips.cc
Passing gradients is a widely used scheme in modern multi-node learning systems (e.g.,
distributed training, collaborative learning). For a long time, people used to believe that …
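As a sketch of the gradient-leakage setting this entry describes, the toy example below optimizes dummy data so that its gradients match a "leaked" gradient from a victim batch; the tiny linear model, soft-label parameterization, and optimizer settings are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(8, 3)

# "Victim" batch and the gradient that would be shared in distributed training.
x_true = torch.randn(1, 8)
y_true = torch.tensor([2])
loss = F.cross_entropy(model(x_true), y_true)
true_grads = [g.detach() for g in torch.autograd.grad(loss, model.parameters())]

# Attacker optimizes dummy data and soft labels so their gradients match the leaked ones.
x_dummy = torch.randn(1, 8, requires_grad=True)
y_dummy = torch.randn(1, 3, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    opt.zero_grad()
    pred = model(x_dummy)
    dummy_loss = torch.sum(-F.softmax(y_dummy, dim=-1) * F.log_softmax(pred, dim=-1))
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    diff.backward()
    return diff

for _ in range(30):
    opt.step(closure)
print(torch.norm(x_dummy.detach() - x_true))  # reconstruction error vs. the true input
```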

PipeDream: Generalized pipeline parallelism for DNN training

D Narayanan, A Harlap, A Phanishayee… - Proceedings of the 27th …, 2019 - dl.acm.org
DNN training is extremely time-consuming, necessitating efficient multi-accelerator
parallelization. Current approaches to parallelizing training primarily use intra-batch …
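To make the pipelining idea behind this entry concrete, the sketch below prints which micro-batch each model stage processes at each time step, showing how splitting a minibatch keeps multiple stages busy; the stage and micro-batch counts are illustrative, and PipeDream's 1F1B backward scheduling and weight versioning are omitted.

```python
NUM_STAGES = 3        # model split into 3 sequential stages (one per device)
NUM_MICROBATCHES = 5  # minibatch split into 5 micro-batches

# At time step t, stage s works on micro-batch (t - s) if that index is in range.
for t in range(NUM_STAGES + NUM_MICROBATCHES - 1):
    row = []
    for s in range(NUM_STAGES):
        mb = t - s
        row.append(f"mb{mb}" if 0 <= mb < NUM_MICROBATCHES else "idle")
    print(f"t={t}: " + " | ".join(f"stage{s}:{w}" for s, w in enumerate(row)))
```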

Lookahead optimizer: k steps forward, 1 step back

M Zhang, J Lucas, J Ba… - Advances in neural …, 2019 - proceedings.neurips.cc
The vast majority of successful deep neural networks are trained using variants of stochastic
gradient descent (SGD) algorithms. Recent attempts to improve SGD can be broadly …
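The title's "k steps forward, 1 step back" rule can be sketched in a few lines: run k fast steps of an inner optimizer, then interpolate the slow weights toward the fast weights. The toy quadratic, plain-SGD inner loop, and hyperparameters below are illustrative assumptions.

```python
import numpy as np

def grad(w):
    return w  # gradient of the toy loss 0.5 * ||w||^2

def lookahead(w0, k=5, alpha=0.5, lr=0.1, outer_steps=20):
    slow = w0.copy()
    for _ in range(outer_steps):
        fast = slow.copy()
        # k fast steps of the inner optimizer.
        for _ in range(k):
            fast -= lr * grad(fast)
        # One slow step: interpolate the slow weights toward the fast weights.
        slow += alpha * (fast - slow)
    return slow

print(lookahead(np.array([3.0, -1.5])))
```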

Grad-match: Gradient matching based data subset selection for efficient deep model training

K Killamsetty, S Durga… - International …, 2021 - proceedings.mlr.press
The great success of modern machine learning models on large datasets is contingent on
extensive computational resources with high financial and environmental costs. One way to …
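As a rough sketch of gradient-matching subset selection, the loop below greedily picks examples whose per-example gradients best reconstruct the full-data gradient, refitting least-squares weights each round; the synthetic gradients, budget, and this simplified matching-pursuit loop are illustrative assumptions rather than GRAD-MATCH's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, budget = 200, 16, 10
per_example_grads = rng.normal(size=(n, d))  # rows: per-example gradients
full_grad = per_example_grads.mean(axis=0)   # target: full-data gradient

selected, residual = [], full_grad.copy()
for _ in range(budget):
    # Pick the example whose gradient is most correlated with the current residual.
    scores = per_example_grads @ residual
    scores[selected] = -np.inf
    selected.append(int(np.argmax(scores)))
    # Re-fit least-squares weights on the chosen subset.
    A = per_example_grads[selected].T        # shape (d, |S|)
    w, *_ = np.linalg.lstsq(A, full_grad, rcond=None)
    residual = full_grad - A @ w

print(selected, np.linalg.norm(residual))
```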

Large batch optimization for deep learning: Training BERT in 76 minutes

Y You, J Li, S Reddi, J Hseu, S Kumar… - arXiv preprint arXiv …, 2019 - arxiv.org
Training large deep neural networks on massive datasets is computationally very
challenging. There has been a recent surge of interest in using large-batch stochastic …
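A minimal sketch of the layerwise trust-ratio scaling at the core of the LAMB optimizer this entry introduces, assuming an Adam-style base update on a single parameter tensor; the beta, epsilon, and weight-decay values are illustrative defaults, not the paper's BERT training recipe.

```python
import numpy as np

def lamb_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-6, wd=0.01):
    # Adam-style first/second moment estimates with bias correction.
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    update = m_hat / (np.sqrt(v_hat) + eps) + wd * w
    # Layerwise trust ratio: scale the step by ||w|| / ||update||.
    w_norm, u_norm = np.linalg.norm(w), np.linalg.norm(update)
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    return w - lr * trust * update, m, v

w = np.ones(4); m = np.zeros(4); v = np.zeros(4)
w, m, v = lamb_step(w, w.copy(), m, v, t=1)  # toy gradient equal to the weights
print(w)
```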