Hanayo: Harnessing wave-like pipeline parallelism for enhanced large model training efficiency

Z Liu, S Cheng, H Zhou, Y You - … of the International Conference for High …, 2023 - dl.acm.org
Large-scale language models have become increasingly challenging and expensive to
train. Among various methods addressing this issue, Pipeline Parallelism has been widely …
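
The snippet only names pipeline parallelism, so here is a back-of-the-envelope sketch of why schedules such as Hanayo's wave-like one matter: it estimates the idle "bubble" fraction of a plain synchronous (GPipe-style) pipeline. This is an illustrative model under assumed uniform stage times, not the paper's scheduler, and all names are my own.

```python
# Minimal sketch (not the Hanayo scheduler): bubble overhead of a
# synchronous GPipe-style pipeline with p stages and m microbatches,
# assuming every stage takes the same time per microbatch.

def pipeline_bubble_fraction(num_stages: int, num_microbatches: int) -> float:
    """Fraction of device time spent idle (the 'pipeline bubble')."""
    # Each device idles for (p - 1) fill/drain slots while useful work
    # occupies m slots, giving (p - 1) / (m + p - 1) idle time.
    p, m = num_stages, num_microbatches
    return (p - 1) / (m + p - 1)

if __name__ == "__main__":
    for m in (4, 8, 32, 128):
        print(f"stages=8, microbatches={m}: "
              f"bubble={pipeline_bubble_fraction(8, m):.2%}")
```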

BPipe: Memory-balanced pipeline parallelism for training large language models

T Kim, H Kim, GI Yu, BG Chun - International Conference on …, 2023 - proceedings.mlr.press
Pipeline parallelism is a key technique for training large language models within GPU
clusters. However, it often leads to a memory imbalance problem, where certain GPUs face …
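
A minimal sketch of the imbalance BPipe targets, under the common assumption that a 1F1B schedule keeps roughly (p - i) microbatches of activations alive on stage i of p stages; the per-microbatch activation size below is made up for illustration.

```python
# Rough illustration (not the paper's memory model): under a 1F1B pipeline
# schedule, earlier stages hold more in-flight activations than later ones,
# which is the imbalance BPipe rebalances.

def activations_in_flight(num_stages: int) -> list[int]:
    """Approximate number of live activation microbatches per stage."""
    return [num_stages - i for i in range(num_stages)]

if __name__ == "__main__":
    per_microbatch_gb = 2.0   # assumed activation footprint per microbatch per stage
    for stage, n in enumerate(activations_in_flight(8)):
        print(f"stage {stage}: ~{n} microbatches in flight, "
              f"~{n * per_microbatch_gb:.0f} GB of activations")
```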

Baechi: fast device placement of machine learning graphs

B Jeon, L Cai, P Srivastava, J Jiang, X Ke… - Proceedings of the 11th …, 2020 - dl.acm.org
Machine Learning graphs (or models) can be challenging or impossible to train when either
devices have limited memory, or the models are large. Splitting the model graph across …
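
For intuition only, below is a toy memory-constrained placement heuristic in the spirit of the problem Baechi solves: assigning ops of a topologically ordered graph to devices without exceeding memory. The paper's actual algorithms (memory-constrained variants of classic scheduling heuristics) are more involved; every name here is an assumption.

```python
# Illustrative greedy device placement sketch (not Baechi's algorithms):
# walk the ops in topological order and put each on the device with the
# most remaining memory that can still hold it.

def greedy_placement(op_mem, device_mem):
    """op_mem: per-op memory in topological order; device_mem: capacity per device."""
    remaining = list(device_mem)
    placement = []
    for i, need in enumerate(op_mem):
        candidates = [d for d, free in enumerate(remaining) if free >= need]
        if not candidates:
            raise MemoryError(f"op {i} ({need} MB) fits on no device")
        best = max(candidates, key=lambda d: remaining[d])
        remaining[best] -= need
        placement.append(best)
    return placement

if __name__ == "__main__":
    print(greedy_placement([4, 8, 6, 2, 10, 3], device_mem=[16, 16]))
```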

Merak: An efficient distributed DNN training framework with automated 3D parallelism for giant foundation models

Z Lai, S Li, X Tang, K Ge, W Liu, Y Duan… - … on Parallel and …, 2023 - ieeexplore.ieee.org
Foundation models are in the process of becoming the dominant deep learning technology.
Pretraining a foundation model is always time-consuming due to the large scale of both the …
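
As a hedged sketch of what "3D parallelism" refers to, the snippet below maps global ranks onto (data, pipeline, tensor) coordinates; the layout, ordering, and names are assumptions for illustration, not Merak's implementation.

```python
# Sketch of a 3D-parallel rank layout: each worker participates in one
# data-parallel, one pipeline-parallel, and one tensor-parallel group.
import itertools

def rank_grid(dp: int, pp: int, tp: int):
    """Return {global_rank: (dp_idx, pp_idx, tp_idx)} for dp*pp*tp ranks."""
    mapping = {}
    for rank, (d, p, t) in enumerate(itertools.product(range(dp), range(pp), range(tp))):
        mapping[rank] = (d, p, t)
    return mapping

if __name__ == "__main__":
    for rank, (d, p, t) in rank_grid(dp=2, pp=2, tp=2).items():
        print(f"rank {rank}: data={d} pipeline={p} tensor={t}")
```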

MODeL: memory optimizations for deep learning

B Steiner, M Elhoushi, J Kahn… - … on Machine Learning, 2023 - proceedings.mlr.press
The size of deep neural networks has grown exponentially in recent years. Unfortunately,
hardware devices have not kept pace with the rapidly increasing memory requirements. To …
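
As a toy illustration of the quantity such memory optimizers reason about, the sketch below computes the peak memory implied by tensor lifetimes under a fixed operator order. It is not the paper's formulation, which optimizes the order and placement themselves; it only shows what is being minimized, with invented numbers.

```python
# Peak memory implied by tensor lifetimes under a given operator order
# (illustrative only). Each lifetime is (alloc_step, free_step, size).

def peak_memory(lifetimes):
    events = []
    for alloc, free, size in lifetimes:
        events.append((alloc, size))      # allocation
        events.append((free, -size))      # release
    current = peak = 0
    # At equal steps, count allocations before releases (conservative peak).
    for _, delta in sorted(events, key=lambda e: (e[0], -e[1])):
        current += delta
        peak = max(peak, current)
    return peak

if __name__ == "__main__":
    # three activations with overlapping lifetimes (steps, size in MB)
    print(peak_memory([(0, 5, 100), (1, 3, 200), (2, 6, 50)]), "MB peak")
```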

A Comparative Analysis of Distributed Training Strategies for GPT-2

I Patwardhan, S Gandhi, O Khare, A Joshi… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement in Large Language Models has been met with significant
challenges in their training processes, primarily due to their considerable computational and …

Characterizing multi-instance GPU for machine learning workloads

B Li, V Gadepally, S Samsi… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
As machine learning (ML) becomes more and more popular, datacenter operators use
hardware accelerators such as GPUs to tackle the high computation demand of ML …

Unicron: Economizing self-healing LLM training at scale

T He, X Li, Z Wang, K Qian, J Xu, W Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Training large-scale language models is increasingly critical in various domains, but it is
hindered by frequent failures, leading to significant time and economic costs. Current failure …
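
A generic checkpoint-and-resume loop, included only to ground what "self-healing" training means in its simplest form; the file naming and the fake `train_step` are assumptions, and Unicron's actual failure detection and recovery protocol is far more elaborate.

```python
# Simplest possible recovery pattern (not Unicron): checkpoint periodically,
# and on failure reload the latest checkpoint and keep going.
import glob, pickle, random

def train_step(state):
    """Fake training step that occasionally fails, to exercise recovery."""
    if random.random() < 0.05:
        raise RuntimeError("simulated worker failure")
    state["step"] += 1
    return state

def load_state():
    ckpts = sorted(glob.glob("ckpt_*.pkl"))
    if not ckpts:
        return {"step": 0}
    with open(ckpts[-1], "rb") as f:
        return pickle.load(f)

def run(total_steps=100, ckpt_every=10):
    state = load_state()
    while state["step"] < total_steps:
        try:
            state = train_step(state)
        except RuntimeError:
            state = load_state()          # roll back to the last good checkpoint
            continue
        if state["step"] % ckpt_every == 0:
            with open(f"ckpt_{state['step']:06d}.pkl", "wb") as f:
                pickle.dump(state, f)
    print("finished at step", state["step"])

if __name__ == "__main__":
    run()
```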

Automated tensor model parallelism with overlapped communication for efficient foundation model training

S Li, Z Lai, Y Hao, W Liu, K Ge, X Deng, D Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep learning is experiencing a rise in foundation models that are expected to lead in
various fields. The massive number of parameters necessitates the use of tensor model …
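
A single-process NumPy sketch of the basic tensor-model-parallel split (a column-parallel matmul followed by a gather of the partial outputs). The paper's contributions, automatic partitioning and overlapping the resulting communication with compute, are not reproduced here; the function and names are assumptions.

```python
# Column-parallel matmul split across simulated 'devices' (illustrative,
# single process): each shard computes part of the output columns, and the
# final concatenation stands in for the all-gather communication step.
import numpy as np

def column_parallel_matmul(x, w, num_shards):
    shards = np.array_split(w, num_shards, axis=1)   # each device holds a weight shard
    partials = [x @ w_i for w_i in shards]           # local compute per device
    return np.concatenate(partials, axis=1)          # communication: all-gather

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 8))
    w = rng.standard_normal((8, 16))
    assert np.allclose(column_parallel_matmul(x, w, num_shards=4), x @ w)
    print("sharded result matches the unsharded matmul")
```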

Comparative analysis of AWS model deployment services

R Bagai - arXiv preprint arXiv:2405.08175, 2024 - arxiv.org
Amazon Web Services (AWS) offers three important Model Deployment Services for model
developers: SageMaker, Lambda, and Elastic Container Service (ECS). These services …