Enabling resource-efficient AIoT system with cross-level optimization: A survey
The emerging field of artificial intelligence of things (AIoT, AI+IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …
Fine-tuning giant neural networks on commodity hardware with automatic pipeline model parallelism
Fine-tuning is an increasingly common technique that leverages transfer learning to
dramatically expedite the training of huge, high-quality models. Critically, fine-tuning holds …
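Fine-tuning here means transfer learning: start from a pretrained model and update only a small part of it on the downstream task. A minimal sketch follows, assuming a torchvision ResNet-18 backbone (torchvision >= 0.13) and a hypothetical 10-class task; it illustrates the general technique only, not the paper's pipeline-parallel system.

import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone (ImageNet weights) as the transfer-learning starting point.
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the pretrained parameters so only the new head is updated.
for p in model.parameters():
    p.requires_grad = False

# Replace the classifier head for a hypothetical 10-class downstream task.
model.fc = nn.Linear(model.fc.in_features, 10)

optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

# One toy optimization step on random tensors standing in for downstream data.
x = torch.randn(8, 3, 224, 224)
y = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()

Because gradients are needed only for the small head, each step updates far fewer parameters than full training, which is what makes fine-tuning large models on commodity hardware plausible once memory is handled (here, via the paper's automatic pipeline model parallelism).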
vPipe: A Virtualized Acceleration System for Achieving Efficient and Scalable Pipeline Parallel DNN Training
DNNs of increasing computational complexity have achieved unprecedented successes in
various areas such as machine vision and natural language processing (NLP), e.g., the …
TSplit: Fine-grained GPU memory management for efficient DNN training via tensor splitting
As Deep Neural Networks (DNNs) become deeper and larger, performing DNN training on
existing accelerators (e.g., GPUs) is challenging due to their limited device memory capacity …
A Survey on Spatio-temporal Big Data Analytics Ecosystem: Resource Management, Processing Platform, and Applications
With the rapid evolution of the Internet, Internet of Things (IoT), and geographic information
systems (GIS), spatio-temporal Big Data (STBD) is experiencing exponential growth …
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
S Singh, P Singhania, A Ranjan… - … Conference for High …, 2024 - ieeexplore.ieee.org
Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions
of parameters requires tens of thousands of GPUs, and a highly scalable software stack. In …
MegTaiChi: Dynamic tensor-based memory management optimization for DNN training
Z Hu, J Xiao, Z Deng, M Li, K Zhang, X Zhang… - Proceedings of the 36th …, 2022 - dl.acm.org
In real applications, it is common to train deep neural networks (DNNs) on modest clusters.
With the continuous increase of model size and batch size, the training of DNNs becomes …
An oracle for guiding large-scale model/hybrid parallel training of convolutional neural networks
Deep Neural Network (DNN) frameworks use distributed training to enable faster time to
convergence and alleviate memory capacity limitations when training large models and/or …
PERKS: a Locality-Optimized Execution Model for Iterative Memory-bound GPU Applications
Iterative memory-bound solvers commonly occur in HPC codes. Typical GPU
implementations have a loop on the host side that invokes the GPU kernel as many times as …
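The pattern this abstract describes, a host-side loop that re-launches the same kernel on every iteration, can be sketched as follows. This is a hypothetical CuPy example of an iterative memory-bound solver (a trivial Jacobi-style smoother), not code from the PERKS paper.

import cupy as cp

# Hypothetical one-dimensional Jacobi-style smoothing kernel, standing in for
# an iterative memory-bound solver.
jacobi_step = cp.RawKernel(r'''
extern "C" __global__
void jacobi_step(const float* x, float* y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int l = i > 0 ? i - 1 : i;
        int r = i < n - 1 ? i + 1 : i;
        y[i] = 0.5f * (x[l] + x[r]);
    }
}
''', 'jacobi_step')

n = 1 << 20
x = cp.random.rand(n, dtype=cp.float32)
y = cp.empty_like(x)
threads = 256
blocks = (n + threads - 1) // threads

# The host-side loop the abstract refers to: every pass re-launches the kernel,
# so the working set is re-read from device memory on each iteration rather
# than staying resident in on-chip storage.
for _ in range(100):
    jacobi_step((blocks,), (threads,), (x, y, cp.int32(n)))
    x, y = y, x

Each relaunch streams the whole working set through device memory again, which is the repeated traffic a locality-optimized execution model for iterative memory-bound applications would aim to avoid.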
An application-oblivious memory scheduling system for DNN accelerators
Deep Neural Networks (DNNs) tend to go deeper and wider, which poses a significant
challenge to their training due to the limited memory capacity of DNN accelerators …