Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Efficient training of large language models on distributed infrastructures: a survey

J Duan, S Zhang, Z Wang, L Jiang, W Qu, Q Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with
their sophisticated capabilities. Training these models requires vast GPU clusters and …
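As background for the distributed-training surveys above, here is a minimal sketch of the core idea such GPU clusters implement, plain data parallelism: every worker computes gradients on its own data shard, the gradients are averaged (an all-reduce in real systems), and all replicas apply the same update. The NumPy simulation and the local_gradient stand-in are illustrative assumptions, not anything taken from the survey itself.

    import numpy as np

    num_workers = 4
    params = np.zeros(8)   # toy shared model parameters
    lr = 0.1

    def local_gradient(params, shard_seed):
        # Hypothetical per-worker gradient on its data shard (random stand-in).
        return np.random.default_rng(shard_seed).normal(size=params.shape)

    for step in range(3):
        # Each worker computes a gradient on its own shard ...
        grads = [local_gradient(params, 100 * step + w) for w in range(num_workers)]
        # ... then the gradients are averaged across workers (the all-reduce step)
        avg_grad = np.mean(grads, axis=0)
        # Every worker applies the same averaged update, keeping replicas in sync.
        params -= lr * avg_grad

    print(params)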

Towards Green AI: Current status and future research

C Clemm, L Stobbe, K Wimalawarne… - … Goes Green 2024+ …, 2024 - ieeexplore.ieee.org
The immense technological progress in artificial intelligence research and applications is
increasingly drawing attention to the environmental sustainability of such systems, a field …

AutoDDL: Automatic Distributed Deep Learning With Near-Optimal Bandwidth Cost

J Chen, S Li, R Guo, J Yuan… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recent advances in deep learning are driven by the growing scale of computation, data, and
models. However, efficiently training large-scale models on distributed systems requires an …
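Since AutoDDL targets near-optimal bandwidth cost, a useful reference point is the textbook per-worker traffic of ring all-reduce, the collective that dominates gradient synchronization. The sketch below only restates that standard formula; the function name and the 7B-parameter example are illustrative, not taken from the paper.

    def ring_allreduce_bytes_per_worker(n_bytes: int, p: int) -> float:
        # Textbook ring all-reduce: each worker sends (p-1)/p * N bytes in the
        # reduce-scatter phase and another (p-1)/p * N in the all-gather phase,
        # so per-worker traffic approaches 2N as p grows.
        return 2 * (p - 1) / p * n_bytes

    # Example: synchronizing 14 GB of fp16 gradients (a 7B-parameter model)
    # across 64 GPUs.
    print(ring_allreduce_bytes_per_worker(14e9, 64) / 1e9, "GB per worker")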

LuWu: An End-to-End In-Network Out-of-Core Optimizer for 100B-Scale Model-in-Network Data-Parallel Training on Distributed GPUs

M Sun, Z Yang, C Liao, Y Li, F Wu, Z Wang - arXiv preprint arXiv …, 2024 - arxiv.org
The recent progress made in large language models (LLMs) has brought tremendous
application prospects to the world. The growing model size demands LLM training on …

UniAP: Unifying inter- and intra-layer automatic parallelism by mixed integer quadratic programming

H Lin, K Wu, J Li, J Li, WJ Li - arXiv preprint arXiv:2307.16375, 2023 - arxiv.org
Distributed learning is commonly used for training deep learning models, especially large
models. In distributed learning, manual parallelism (MP) methods demand considerable …
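UniAP formulates strategy selection as a mixed integer quadratic program; the sketch below does not reproduce that formulation. It only illustrates the search space such automatic-parallelism planners explore, by naively enumerating (data, tensor, pipeline) parallel degrees under a made-up cost model. All costs and names here are hypothetical.

    from itertools import product

    def toy_step_cost(dp: int, tp: int, pp: int) -> float:
        compute = 100.0 / (dp * tp * pp)        # perfect compute scaling (assumed)
        comm = 2.0 * (dp - 1) + 5.0 * (tp - 1)  # made-up communication penalties
        bubble = 3.0 * (pp - 1)                 # made-up pipeline-bubble penalty
        return compute + comm + bubble

    devices = 8
    # Enumerate every degree assignment whose product matches the device count.
    candidates = [
        (dp, tp, pp)
        for dp, tp, pp in product(range(1, devices + 1), repeat=3)
        if dp * tp * pp == devices
    ]
    best = min(candidates, key=lambda c: toy_step_cost(*c))
    print("best (dp, tp, pp):", best)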

Automatic parallelism strategy generation with minimal memory redundancy

Y Shi, P Liang, H Zheng, L Qiao, D Li - Frontiers of Information Technology …, 2024 - Springer
Large-scale deep learning models are trained in a distributed manner due to memory and
computing resource limitations. Few existing strategy generation approaches take optimal memory …
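The memory redundancy such strategies target comes largely from training state replicated on every GPU. A back-of-envelope estimator, assuming mixed-precision Adam as in the well-known ZeRO accounting (16 bytes per parameter), makes the scale concrete; the function and the 7B-parameter example are illustrative assumptions, not taken from the paper.

    def replicated_training_bytes(num_params: float) -> dict:
        # Assuming mixed-precision Adam: fp16 params (2 B) + fp16 grads (2 B)
        # + fp32 master copy (4 B) + fp32 Adam momentum and variance (8 B)
        # = 16 B/param, all duplicated per GPU under plain data parallelism.
        return {
            "params_fp16": 2 * num_params,
            "grads_fp16": 2 * num_params,
            "master_fp32": 4 * num_params,
            "adam_moments_fp32": 8 * num_params,
            "total": 16 * num_params,
        }

    # A 7B-parameter model needs ~112 GB of training state per replica; this
    # replication is exactly what memory-aware sharded strategies eliminate.
    print(replicated_training_bytes(7e9)["total"] / 1e9, "GB")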

Assessing Inference Time in Large Language Models

B Walkowiak, T Walkowiak - International Conference on Dependability of …, 2024 - Springer
Large Language Models have transformed the field of artificial intelligence, yet they
are often associated with elitism and inaccessibility. This is primarily due to the large number …
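One simple way to assess inference time in practice, sketched here with Hugging Face Transformers: time greedy decoding with a wall-clock timer and report tokens per second. The choice of gpt2, the prompt, and the 64-token budget are arbitrary stand-ins, not the paper's experimental setup.

    import time
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    inputs = tok("Large language models are", return_tensors="pt")
    new_tokens = 64

    with torch.no_grad():
        start = time.perf_counter()
        # Greedy decoding; sampling and batching would change the numbers.
        out = model.generate(**inputs, max_new_tokens=new_tokens, do_sample=False)
        elapsed = time.perf_counter() - start

    generated = out.shape[1] - inputs["input_ids"].shape[1]
    print(f"{elapsed:.2f} s total, {generated / elapsed:.1f} tokens/s")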