Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Performance enhancement of artificial intelligence: A survey

M Krichen, MS Abdalzaher - Journal of Network and Computer Applications, 2024 - Elsevier
The advent of machine learning (ML) and artificial intelligence (AI) has brought about a
significant transformation across multiple industries, as it has facilitated the automation of …

Resource-efficient algorithms and systems of foundation models: A survey

M Xu, D Cai, W Yin, S Wang, X Jin, X Liu - ACM Computing Surveys, 2025 - dl.acm.org
Large foundation models, including large language models, vision transformers, diffusion,
and large language model-based multimodal models, are revolutionizing the entire machine …

A survey of resource-efficient LLM and multimodal foundation models

M Xu, W Yin, D Cai, R Yi, D Xu, Q Wang, B Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine …

BPipe: Memory-balanced pipeline parallelism for training large language models

T Kim, H Kim, GI Yu, BG Chun - International Conference on …, 2023 - proceedings.mlr.press
Pipeline parallelism is a key technique for training large language models within GPU
clusters. However, it often leads to a memory imbalance problem, where certain GPUs face …
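The memory imbalance mentioned in the abstract can be illustrated with a back-of-the-envelope sketch, assuming a standard 1F1B pipeline schedule in which stage i keeps (num_stages − i) microbatches of activations in flight; the helper and numbers below are hypothetical and are not the BPipe method itself.

```python
# Hypothetical illustration of pipeline-parallel memory imbalance under a
# 1F1B schedule: earlier stages must stash more in-flight microbatch
# activations than later stages, so their peak memory is higher.

def stage_activation_memory_gb(num_stages: int, act_gb_per_microbatch: float) -> list[float]:
    """Peak stashed-activation memory per pipeline stage (illustrative only)."""
    return [(num_stages - i) * act_gb_per_microbatch for i in range(num_stages)]

if __name__ == "__main__":
    # Assumed numbers: 8 stages, 2 GB of activations per microbatch per stage.
    for stage, mem in enumerate(stage_activation_memory_gb(8, 2.0)):
        print(f"stage {stage}: ~{mem:.0f} GB stashed activations")
    # Stage 0 holds ~16 GB while stage 7 holds ~2 GB: the imbalance the paper targets.
```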

Enabling Parallelism Hot Switching for Efficient Training of Large Language Models

H Ge, F Fu, H Li, X Wang, S Lin, Y Wang, X Nie… - Proceedings of the …, 2024 - dl.acm.org
Training of large-scale deep learning models necessitates parallelizing the model and data
across numerous devices, and the choice of parallelism strategy substantially depends on …

Proteus: Simulating the performance of distributed DNN training

J Duan, X Li, P Xu, X Zhang, S Yan… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
DNN models are becoming increasingly larger to achieve unprecedented accuracy, and the
accompanying increased computation and memory requirements necessitate the …

Does compressing activations help model parallel training?

S Bian, D Li, H Wang, E Xing… - … of Machine Learning …, 2024 - proceedings.mlsys.org
Foundation models have superior performance across a wide array of machine learning
tasks. The training of these models typically involves model parallelism (MP) to navigate the …
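As a rough illustration of what "compressing activations" can mean in this setting, the sketch below applies per-tensor int8 quantization to an activation tensor before it crosses a model-parallel boundary and dequantizes it on the other side; this is a generic scheme for illustration, not necessarily the one studied in the paper.

```python
# Generic activation-compression sketch (not the paper's specific method):
# per-tensor uniform int8 quantization before a model-parallel transfer,
# followed by dequantization on the receiving device.
import numpy as np

def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float32 activations to int8 plus a per-tensor scale (~4x smaller payload)."""
    scale = float(np.abs(x).max()) / 127.0
    if scale == 0.0:
        scale = 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

acts = np.random.randn(4, 1024).astype(np.float32)  # hypothetical activation tensor
q, scale = quantize_int8(acts)
recovered = dequantize_int8(q, scale)
print("max abs reconstruction error:", float(np.abs(acts - recovered).max()))
```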

Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training

S Li, Z Lai, Y Hao, W Liu, K Ge, X Deng, D Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep learning is experiencing a rise in foundation models that are expected to lead in
various fields. The massive number of parameters necessitates the use of tensor model …
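For readers unfamiliar with tensor model parallelism, the sketch below shows the basic idea the abstract assumes: a linear layer's weight matrix is split column-wise across devices, each device computes its own output slice, and the slices are gathered. Overlapping that communication with computation is what the paper automates; the sketch makes no attempt at that and is purely illustrative.

```python
# Minimal column-parallel linear layer sketch (illustrative; not the paper's
# automated system): the weight matrix is split column-wise across "devices",
# each shard produces a slice of the output, and the slices are concatenated
# (an all-gather in a real distributed setting).
import numpy as np

def column_parallel_matmul(x: np.ndarray, w: np.ndarray, num_devices: int) -> np.ndarray:
    shards = np.array_split(w, num_devices, axis=1)     # one column shard per device
    partial_outputs = [x @ shard for shard in shards]   # computed in parallel in practice
    return np.concatenate(partial_outputs, axis=1)      # gather the output slices

x = np.random.randn(2, 16)
w = np.random.randn(16, 32)
assert np.allclose(column_parallel_matmul(x, w, num_devices=4), x @ w)
```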

Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries

H Luo, H Wu, H Zhou, L Xing, Y Di, J Wang… - arXiv preprint arXiv …, 2025 - arxiv.org
Although deep models have been widely explored in solving partial differential equations
(PDEs), previous works are primarily limited to data with only up to tens of thousands of …