Neural architecture search survey: A computer vision perspective

JS Kang, JK Kang, JJ Kim, KW Jeon, HJ Chung… - Sensors, 2023 - mdpi.com
In recent years, deep learning (DL) has been widely studied using various methods across
the globe, especially with respect to training methods and network structures, proving highly …

Transforming large-size to lightweight deep neural networks for IoT applications

R Mishra, H Gupta - ACM Computing Surveys, 2023 - dl.acm.org
Deep Neural Networks (DNNs) have gained unprecedented popularity due to their high-
order performance and automated feature extraction capability. This has encouraged …

Memorization without overfitting: Analyzing the training dynamics of large language models

K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very
large language models are not well understood. We empirically study exact memorization in …
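
Since the entry above studies "exact memorization", a minimal sketch of one common way to measure it may help: check how many continuation tokens a model reproduces verbatim under greedy decoding. The `logits_fn` interface below is an assumption (e.g., a thin wrapper around any causal language model), not the authors' code.

```python
import torch

def exact_memorization_rate(logits_fn, token_ids, prefix_len):
    """Fraction of continuation tokens the model predicts exactly (greedy argmax).

    logits_fn: assumed callable mapping a (1, T) LongTensor of token ids to (1, T, V) logits.
    token_ids: (T,) LongTensor holding one training sequence.
    prefix_len: number of leading tokens treated as the fixed context.
    """
    with torch.no_grad():
        logits = logits_fn(token_ids.unsqueeze(0))    # (1, T, V)
    preds = logits.argmax(dim=-1)[0]                  # position t predicts token t+1
    targets = token_ids[prefix_len:]                  # tokens the model should reproduce
    greedy = preds[prefix_len - 1:-1]                 # predictions aligned with those targets
    return (greedy == targets).float().mean().item()
```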

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
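
For the pruning theme of the survey above, a minimal sketch of one-shot global magnitude pruning in PyTorch is shown below; the helper name and the choice to skip biases are illustrative assumptions, and practical pipelines usually prune iteratively with fine-tuning between rounds.

```python
import torch
import torch.nn as nn

def global_magnitude_prune(model: nn.Module, sparsity: float = 0.9):
    """Zero out the smallest-magnitude weights across the whole model (one-shot)."""
    weights = [p for _, p in model.named_parameters() if p.dim() > 1]  # skip biases/norms
    all_vals = torch.cat([w.detach().abs().flatten() for w in weights])
    threshold = torch.quantile(all_vals, sparsity)      # global magnitude cutoff
    with torch.no_grad():
        for w in weights:
            w.mul_((w.abs() > threshold).float())       # apply binary mask in place
    return model

# Example: prune a small MLP to 90% sparsity
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
global_magnitude_prune(model, sparsity=0.9)
```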

The lottery ticket hypothesis for pre-trained BERT networks

T Chen, J Frankle, S Chang, S Liu… - Advances in neural …, 2020 - proceedings.neurips.cc
In natural language processing (NLP), enormous pre-trained models like BERT have
become the standard starting point for training on a range of downstream tasks, and similar …
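
The lottery ticket procedure referenced in the title is, at its core, iterative magnitude pruning (IMP) with weight rewinding. The sketch below is a generic, hedged version of that loop; the `train_fn` callable and the rewind-to-initialization choice are assumptions (Chen et al. start from pre-trained BERT weights rather than random initializations).

```python
import copy
import torch

def iterative_magnitude_pruning(model, train_fn, rounds=3, prune_frac=0.2):
    """Lottery-ticket-style IMP: train, prune smallest surviving weights, rewind.

    train_fn(model) is assumed to train the model in place for one round; a full
    implementation also keeps pruned weights at zero during training (e.g., by
    zeroing their gradients).
    """
    init_state = copy.deepcopy(model.state_dict())     # rewind target
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters() if p.dim() > 1}

    for _ in range(rounds):
        train_fn(model)                                 # train the current subnetwork
        for n, p in model.named_parameters():
            if n not in masks:
                continue
            alive = p.detach().abs()[masks[n].bool()]
            cutoff = torch.quantile(alive, prune_frac)  # prune a fraction of survivors
            masks[n] = masks[n] * (p.detach().abs() > cutoff).float()
        with torch.no_grad():                           # rewind and re-apply the mask
            for n, p in model.named_parameters():
                p.copy_(init_state[n])
                if n in masks:
                    p.mul_(masks[n])
    return model, masks
```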

Linear mode connectivity and the lottery ticket hypothesis

J Frankle, GK Dziugaite, D Roy… - … on Machine Learning, 2020 - proceedings.mlr.press
We study whether a neural network optimizes to the same, linearly connected minimum
under different samples of SGD noise (e.g., random data order and augmentation). We find …
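
Linear mode connectivity, as studied above, asks whether the loss stays low along the straight line between the weights found by two SGD runs. A minimal sketch of that interpolation check follows; `loss_fn` and the barrier definition (peak loss minus the worse endpoint) are simplifying assumptions rather than the paper's exact protocol.

```python
import torch

def linear_path_losses(model, state_a, state_b, loss_fn, num_points=11):
    """Evaluate the loss along the straight line between two weight settings.

    state_a/state_b: state_dicts of two runs; loss_fn(model) is assumed to return
    a scalar loss on a held-out batch. A flat curve (no bump above the endpoints)
    indicates linear mode connectivity. Running statistics are interpolated too,
    which is a simplification.
    """
    losses = []
    for a in torch.linspace(0, 1, num_points):
        mixed = {k: (1 - a) * state_a[k] + a * state_b[k] for k in state_a}
        model.load_state_dict(mixed)
        with torch.no_grad():
            losses.append(float(loss_fn(model)))
    barrier = max(losses) - max(losses[0], losses[-1])  # height of the bump, if any
    return losses, barrier
```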

Where to begin? On the impact of pre-training and initialization in federated learning

J Nguyen, J Wang, K Malik, M Sanjabi… - arxiv preprint arxiv …, 2022 - arxiv.org

DARTS+: Improved differentiable architecture search with early stopping

H Liang, S Zhang, J Sun, X He, W Huang… - arxiv preprint arxiv …, 2019 - arxiv.org
Recently, there has been a growing interest in automating the process of neural architecture
design, and the Differentiable Architecture Search (DARTS) method makes the process …
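
DARTS, named in the snippet above, relaxes the discrete choice among candidate operations into a softmax-weighted mixture, so architecture parameters and network weights can both be optimized by gradient descent. A minimal sketch of that relaxation (the candidate operations and channel size are placeholders) is:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """DARTS-style continuous relaxation of the operation choice on one edge."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # candidate: 3x3 conv
            nn.Conv2d(channels, channels, 5, padding=2),  # candidate: 5x5 conv
            nn.Identity(),                                # candidate: skip connection
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture parameters

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# At the end of the search, the operation with the largest alpha is kept on this edge.
```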

Accelerating dataset distillation via model augmentation

L Zhang, J Zhang, B Lei, S Mukherjee… - Proceedings of the …, 2023 - openaccess.thecvf.com
Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but
efficient synthetic training datasets from large ones. Existing DD methods based on gradient …
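
Many of the gradient-based DD methods the abstract refers to optimize the synthetic data so that the gradients it induces in a network match those of real batches. A hedged sketch of such a gradient-matching loss is below; the cosine distance and the function signature are illustrative choices, not the specific objective of Zhang et al.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_matching_loss(model, real_x, real_y, syn_x, syn_y):
    """Gradient-matching objective used by many dataset distillation methods.

    syn_x would be a leaf tensor with requires_grad=True; minimizing this loss
    w.r.t. syn_x pushes the synthetic batch to induce the same parameter
    gradients as the real batch.
    """
    params = [p for p in model.parameters() if p.requires_grad]

    g_real = torch.autograd.grad(
        F.cross_entropy(model(real_x), real_y), params)                    # real-data gradient
    g_syn = torch.autograd.grad(
        F.cross_entropy(model(syn_x), syn_y), params, create_graph=True)   # differentiable w.r.t. syn_x

    # Cosine distance between the two gradient sets, summed over parameter tensors
    loss = 0.0
    for gr, gs in zip(g_real, g_syn):
        loss = loss + (1 - F.cosine_similarity(gr.flatten(), gs.flatten(), dim=0))
    return loss
```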

Understanding the role of training regimes in continual learning

SI Mirzadeh, M Farajtabar, R Pascanu… - Advances in …, 2020 - proceedings.neurips.cc
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn
multiple tasks sequentially. From the perspective of the well established plasticity-stability …
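
As a small worked example of the forgetting the abstract describes, one common measurement is the drop in task-A accuracy after the model is trained sequentially on task B; the callables below are assumed placeholders.

```python
def forgetting_on_task_a(model, eval_task_a, train_task_b):
    """Drop in task-A accuracy after the model additionally trains on task B.

    eval_task_a(model) -> accuracy in [0, 1]; train_task_b(model) trains in place.
    Mirzadeh et al. study how the training regime (learning rate, batch size,
    regularization) changes the size of this drop.
    """
    acc_before = eval_task_a(model)
    train_task_b(model)              # sequential training on the second task
    acc_after = eval_task_a(model)
    return acc_before - acc_after    # larger value = more catastrophic forgetting
```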