I/O access patterns in HPC applications: A 360-degree survey

JL Bez, S Byna, S Ibrahim - ACM Computing Surveys, 2023 - dl.acm.org
The high-performance computing I/O stack has been complex due to multiple software
layers, the inter-dependencies among these layers, and the different performance tuning …

Understanding data storage and ingestion for large-scale deep recommendation model training: Industrial product

M Zhao, N Agarwal, A Basant, B Gedik, S Pan… - Proceedings of the 49th …, 2022 - dl.acm.org
Datacenter-scale AI training clusters consisting of thousands of domain-specific accelerators
(DSA) are used to train increasingly complex deep learning models. These clusters rely on a …

Analyzing and mitigating data stalls in DNN training

J Mohan, A Phanishayee, A Raniwala… - arXiv preprint arXiv …, 2020 - arxiv.org
Training Deep Neural Networks (DNNs) is resource-intensive and time-consuming. While
prior research has explored many different ways of reducing DNN training time, the impact of …

Fluid: Dataset abstraction and elastic acceleration for cloud-native deep learning training jobs

R Gu, K Zhang, Z Xu, Y Che, B Fan… - 2022 IEEE 38th …, 2022 - ieeexplore.ieee.org
Nowadays, it is prevalent to train deep learning (DL) models in cloud-native platforms that
actively leverage containerization and orchestration technologies for high elasticity, low and …

SHADE: Enable fundamental cacheability for distributed deep learning training

RIS Khan, AH Yazdani, Y Fu, AK Paul, B Ji… - … USENIX Conference on …, 2023 - usenix.org
Deep learning training (DLT) applications exhibit unique I/O workload behaviors that pose
new challenges for storage system design. DLT is I/O intensive since data samples need to …

Clairvoyant prefetching for distributed machine learning I/O

N Dryden, R Böhringer, T Ben-Nun… - Proceedings of the …, 2021 - dl.acm.org
I/O is emerging as a major bottleneck for machine learning training, especially in distributed
environments. Indeed, at large scale, I/O takes as much as 85% of training time. Addressing …

Quiver: An informed storage cache for deep learning

AV Kumar, M Sivathanu - 18th USENIX Conference on File and Storage …, 2020 - usenix.org
We introduce Quiver, an informed storage cache for deep learning training (DLT) jobs in a
cluster of GPUs. Quiver employs domain-specific intelligence within the caching layer, to …

I/O characterization and performance evaluation of BeeGFS for deep learning

F Chowdhury, Y Zhu, T Heer, S Paredes… - Proceedings of the 48th …, 2019 - dl.acm.org
Parallel File Systems (PFSs) are frequently deployed on leadership High Performance
Computing (HPC) systems to ensure efficient I/O, persistent storage and scalable …

DeepFreeze: Towards scalable asynchronous checkpointing of deep learning models

B Nicolae, J Li, JM Wozniak, G Bosilca… - 2020 20th IEEE/ACM …, 2020 - ieeexplore.ieee.org
In the age of big data, deep learning has emerged as a powerful tool to extract insight and
exploit its value, both in industry and scientific applications. One common pattern emerging …

Why globally re-shuffle? Revisiting data shuffling in large scale deep learning

TT Nguyen, F Trahay, J Domke, A Drozd… - 2022 IEEE …, 2022 - ieeexplore.ieee.org
Stochastic gradient descent (SGD) is the most prevalent algorithm for training Deep Neural
Networks (DNN). SGD iterates the input data set in each training epoch processing data …