Efficient large language models: A survey
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …
Performance enhancement of artificial intelligence: A survey
The advent of machine learning (ML) and artificial intelligence (AI) has brought about a
significant transformation across multiple industries, as it has facilitated the automation of …
Resource-efficient algorithms and systems of foundation models: A survey
Large foundation models, including large language models, vision transformers, diffusion,
and large language model-based multimodal models, are revolutionizing the entire machine …
A survey of resource-efficient LLM and multimodal foundation models
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine …
BPipe: memory-balanced pipeline parallelism for training large language models
Pipeline parallelism is a key technique for training large language models within GPU
clusters. However, it often leads to a memory imbalance problem, where certain GPUs face …
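For context on the imbalance this paper targets: under the common 1F1B schedule, stage i of a p-stage pipeline keeps up to p - i microbatches of activations in flight, so the first stage holds roughly p times the activation memory of the last. A minimal Python sketch of that arithmetic (the 1F1B schedule and a uniform per-microbatch activation size are illustrative assumptions, not details from the paper):

    def peak_activation_gb(num_stages, gb_per_microbatch):
        # Under 1F1B, stage i holds up to (num_stages - i) in-flight
        # microbatches of activations during the steady state.
        return [(num_stages - i) * gb_per_microbatch for i in range(num_stages)]

    print(peak_activation_gb(8, 2.0))
    # [16.0, 14.0, 12.0, 10.0, 8.0, 6.0, 4.0, 2.0] -- stage 0 needs 8x stage 7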
Enabling Parallelism Hot Switching for Efficient Training of Large Language Models
Training of large-scale deep learning models necessitates parallelizing the model and data
across numerous devices, and the choice of parallelism strategy substantially depends on …
Proteus: Simulating the performance of distributed DNN training
DNN models are becoming increasingly large to achieve unprecedented accuracy, and the
accompanying increased computation and memory requirements necessitate the …
Does compressing activations help model parallel training?
Foundation models have superior performance across a wide array of machine learning
tasks. The training of these models typically involves model parallelism (MP) to navigate the …
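As a rough illustration of the question the title poses: activations crossing a model-parallel boundary can be quantized before communication and dequantized on the receiving side, trading a small reconstruction error for less traffic. A minimal NumPy sketch of generic per-tensor int8 quantization (an illustrative scheme, not necessarily the one evaluated in the paper):

    import numpy as np

    def quantize_int8(x):
        # Symmetric per-tensor quantization: the message shrinks 4x
        # (float32 -> int8) at the cost of a small reconstruction error.
        scale = np.abs(x).max() / 127.0 + 1e-12
        q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize_int8(q, scale):
        return q.astype(np.float32) * scale

    x = np.random.randn(4, 1024).astype(np.float32)
    q, s = quantize_int8(x)        # send q and scale instead of x
    x_hat = dequantize_int8(q, s)  # receiver reconstructs the activations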
Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training
Deep learning is experiencing a rise in foundation models that are expected to lead in
various fields. The massive number of parameters necessitates the use of tensor model …
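The generic idea behind overlapping communication with computation in tensor model parallelism is to split the work into chunks and launch the all-reduce for one chunk asynchronously while computing the next. A minimal PyTorch sketch of a row-parallel linear layer (assumes torch.distributed is already initialized; the hand-written chunking is illustrative and not this paper's automated strategy):

    import torch
    import torch.distributed as dist

    def row_parallel_linear(x_chunks, w_local):
        # Each rank holds a shard of the weight, so partial outputs must be
        # summed across ranks. async_op=True lets the all-reduce of chunk k
        # overlap with the matmul of chunk k+1.
        outs, handles = [], []
        for xc in x_chunks:
            yc = xc @ w_local  # local partial result for this chunk
            handles.append(dist.all_reduce(yc, op=dist.ReduceOp.SUM, async_op=True))
            outs.append(yc)
        for h in handles:
            h.wait()  # ensure all reductions have completed
        return torch.cat(outs, dim=0)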
Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries
Although deep models have been widely explored in solving partial differential equations
(PDEs), previous works are primarily limited to data with only up to tens of thousands of …