Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Efficient training of large language models on distributed infrastructures: a survey

J Duan, S Zhang, Z Wang, L Jiang, W Qu, Q Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with
their sophisticated capabilities. Training these models requires vast GPU clusters and …
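As background for the distributed-training surveys above, here is a minimal sketch of the core idea such GPU clusters implement, plain data parallelism: every worker computes gradients on its own data shard, the gradients are averaged (an all-reduce in real systems), and all replicas apply the same update. The NumPy simulation and the local_gradient stand-in are illustrative assumptions, not anything taken from the survey itself.

    import numpy as np

    num_workers = 4
    params = np.zeros(8)   # toy shared model parameters
    lr = 0.1

    def local_gradient(params, shard_seed):
        # Hypothetical per-worker gradient on its data shard (random stand-in).
        return np.random.default_rng(shard_seed).normal(size=params.shape)

    for step in range(3):
        # Each worker computes a gradient on its own shard ...
        grads = [local_gradient(params, 100 * step + w) for w in range(num_workers)]
        # ... then the gradients are averaged across workers (the all-reduce step)
        avg_grad = np.mean(grads, axis=0)
        # Every worker applies the same averaged update, keeping replicas in sync.
        params -= lr * avg_grad

    print(params)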

Towards Green AI: Current status and future research

C Clemm, L Stobbe, K Wimalawarne… - … Goes Green 2024+ …, 2024 - ieeexplore.ieee.org
The immense technological progress in artificial intelligence research and applications is
increasingly drawing attention to the environmental sustainability of such systems, a field …

AutoDDL: Automatic Distributed Deep Learning With Near-Optimal Bandwidth Cost

J Chen, S Li, R Guo, J Yuan… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recent advances in deep learning are driven by the growing scale of computation, data, and
models. However, efficiently training large-scale models on distributed systems requires an …
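Since AutoDDL targets near-optimal bandwidth cost, a useful reference point is the textbook per-worker traffic of ring all-reduce, the collective that dominates gradient synchronization. The sketch below only restates that standard formula; the function name and the 7B-parameter example are illustrative, not taken from the paper.

    def ring_allreduce_bytes_per_worker(n_bytes: int, p: int) -> float:
        # Textbook ring all-reduce: each worker sends (p-1)/p * N bytes in the
        # reduce-scatter phase and another (p-1)/p * N in the all-gather phase,
        # so per-worker traffic approaches 2N as p grows.
        return 2 * (p - 1) / p * n_bytes

    # Example: synchronizing 14 GB of fp16 gradients (a 7B-parameter model)
    # across 64 GPUs.
    print(ring_allreduce_bytes_per_worker(14e9, 64) / 1e9, "GB per worker")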

LuWu: An End-to-End In-Network Out-of-Core Optimizer for 100B-Scale Model-in-Network Data-Parallel Training on Distributed GPUs

M Sun, Z Yang, C Liao, Y Li, F Wu, Z Wang - arXiv preprint arXiv …, 2024 - arxiv.org
The recent progress made in large language models (LLMs) has brought tremendous
application prospects to the world. The growing model size demands LLM training on …

UniAP: Unifying inter- and intra-layer automatic parallelism by mixed integer quadratic programming

H Lin, K Wu, J Li, J Li, WJ Li - arXiv preprint arXiv:2307.16375, 2023 - arxiv.org
Distributed learning is commonly used for training deep learning models, especially large
models. In distributed learning, manual parallelism (MP) methods demand considerable …
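UniAP formulates strategy selection as a mixed integer quadratic program; the sketch below does not reproduce that formulation. It only illustrates the search space such automatic-parallelism planners explore, by naively enumerating (data, tensor, pipeline) parallel degrees under a made-up cost model. All costs and names here are hypothetical.

    from itertools import product

    def toy_step_cost(dp: int, tp: int, pp: int) -> float:
        compute = 100.0 / (dp * tp * pp)        # perfect compute scaling (assumed)
        comm = 2.0 * (dp - 1) + 5.0 * (tp - 1)  # made-up communication penalties
        bubble = 3.0 * (pp - 1)                 # made-up pipeline-bubble penalty
        return compute + comm + bubble

    devices = 8
    # Enumerate every degree assignment whose product matches the device count.
    candidates = [
        (dp, tp, pp)
        for dp, tp, pp in product(range(1, devices + 1), repeat=3)
        if dp * tp * pp == devices
    ]
    best = min(candidates, key=lambda c: toy_step_cost(*c))
    print("best (dp, tp, pp):", best)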

Automatic parallelism strategy generation with minimal memory redundancy

Y Shi, P Liang, H Zheng, L Qiao, D Li - Frontiers of Information Technology …, 2024 - Springer
Large-scale deep learning models are trained in a distributed manner due to memory and
computing resource limitations. Few existing strategy generation approaches take optimal memory …
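The memory redundancy such strategies target comes largely from training state replicated on every GPU. A back-of-envelope estimator, assuming mixed-precision Adam as in the well-known ZeRO accounting (16 bytes per parameter), makes the scale concrete; the function and the 7B-parameter example are illustrative assumptions, not taken from the paper.

    def replicated_training_bytes(num_params: float) -> dict:
        # Assuming mixed-precision Adam: fp16 params (2 B) + fp16 grads (2 B)
        # + fp32 master copy (4 B) + fp32 Adam momentum and variance (8 B)
        # = 16 B/param, all duplicated per GPU under plain data parallelism.
        return {
            "params_fp16": 2 * num_params,
            "grads_fp16": 2 * num_params,
            "master_fp32": 4 * num_params,
            "adam_moments_fp32": 8 * num_params,
            "total": 16 * num_params,
        }

    # A 7B-parameter model needs ~112 GB of training state per replica; this
    # replication is exactly what memory-aware sharded strategies eliminate.
    print(replicated_training_bytes(7e9)["total"] / 1e9, "GB")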

Assessing Inference Time in Large Language Models

B Walkowiak, T Walkowiak - International Conference on Dependability of …, 2024 - Springer
Large Language Models have transformed the field of artificial intelligence, yet they
are often associated with elitism and inaccessibility. This is primarily due to the large number …
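One simple way to assess inference time in practice, sketched here with Hugging Face Transformers: time greedy decoding with a wall-clock timer and report tokens per second. The choice of gpt2, the prompt, and the 64-token budget are arbitrary stand-ins, not the paper's experimental setup.

    import time
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    inputs = tok("Large language models are", return_tensors="pt")
    new_tokens = 64

    with torch.no_grad():
        start = time.perf_counter()
        # Greedy decoding; sampling and batching would change the numbers.
        out = model.generate(**inputs, max_new_tokens=new_tokens, do_sample=False)
        elapsed = time.perf_counter() - start

    generated = out.shape[1] - inputs["input_ids"].shape[1]
    print(f"{elapsed:.2f} s total, {generated / elapsed:.1f} tokens/s")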