Communication-efficient distributed deep learning: A comprehensive survey
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …
LoRA-FA: Memory-efficient low-rank adaptation for large language models fine-tuning
The low-rank adaptation (LoRA) method can greatly reduce the number of trainable
parameters for fine-tuning large language models (LLMs); however, it still requires …
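
As a point of reference for the LoRA idea named in the entry above (not the LoRA-FA variant itself), here is a minimal sketch, assuming standard PyTorch, of a linear layer with a frozen pretrained weight and a trainable low-rank update; the class name LoRALinear and the rank/alpha values are illustrative assumptions.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight W plus a trainable low-rank update B @ A (generic LoRA)."""
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)               # pretrained weight stays fixed
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # y = x W^T + scale * x A^T B^T; only A and B receive gradients
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(768, 768, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # 12,288 trainable values vs. 589,824 in the frozen base weight
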
FusionAI: Decentralized training and deploying LLMs with massive consumer-level GPUs
The rapid growth of memory and computation requirements of large language models
(LLMs) has outpaced the development of hardware, hindering people who lack large-scale …
Distributed Learning in Intelligent Transportation Systems: A Survey
The development of artificial intelligence (AI) and self-driving technology is expected to
enhance intelligent transportation systems (ITSs) by improving road safety and mobility …
FusionLLM: A decentralized LLM training system on geo-distributed GPUs with adaptive compression
To alleviate hardware scarcity in training large deep neural networks (DNNs), particularly
large language models (LLMs), we present FusionLLM, a decentralized training system …
Sparse Gradient Communication with AlltoAll for Accelerating Distributed Deep Learning
Synchronous stochastic gradient descent (S-SGD) with data parallelism has become a de facto
approach in training large-scale deep neural networks (DNNs) on multi-GPU systems …
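
To illustrate the gradient sparsification idea behind the entry above, here is a rough top-k sketch, assuming PyTorch; the 1% ratio and the function names are illustrative, and the AlltoAll communication step itself is omitted.

import torch

def sparsify_topk(grad: torch.Tensor, ratio: float = 0.01):
    """Keep only the k largest-magnitude entries; send (values, indices) instead of the dense gradient."""
    flat = grad.reshape(-1)
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    return flat[indices], indices, flat.numel()

def densify(values, indices, numel, shape):
    """Receiver side: scatter the sparse entries back into a dense tensor."""
    dense = torch.zeros(numel, dtype=values.dtype)
    dense[indices] = values
    return dense.reshape(shape)

g = torch.randn(1024, 1024)
values, indices, numel = sparsify_topk(g)          # roughly 1% of entries are transmitted
g_hat = densify(values, indices, numel, g.shape)   # what the receiver reconstructs
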
Preserving Near-Optimal Gradient Sparsification Cost for Scalable Distributed Deep Learning
Communication overhead is a major obstacle to scaling distributed training systems.
Gradient sparsification is a potential optimization approach to reduce the communication …
Near-Lossless Gradient Compression for Data-Parallel Distributed DNN Training
Data parallelism has become a cornerstone in scaling up the training of deep neural
networks (DNNs). However, the communication overhead associated with synchronizing …
FedSSA: Reducing Overhead of Additive Cryptographic Methods in Federated Learning With Sketch
Federated Learning (FL) has been applied across diverse domains as a powerful technique
but faces critical challenges in privacy protection. Secure aggregation and additive …
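
The sketching idea referenced above can be illustrated with a plain count sketch, which maps a long gradient vector into a much shorter table that can then be aggregated (or encrypted) cheaply; this is a generic count sketch in PyTorch, not FedSSA's specific construction, and the sizes are arbitrary assumptions.

import torch

class CountSketch:
    """Generic count sketch: each coordinate hashes to one bucket with a random sign."""
    def __init__(self, dim: int, width: int, seed: int = 0):
        gen = torch.Generator().manual_seed(seed)
        self.bucket = torch.randint(0, width, (dim,), generator=gen)
        self.sign = (torch.randint(0, 2, (dim,), generator=gen) * 2 - 1).float()
        self.width = width

    def encode(self, vec: torch.Tensor) -> torch.Tensor:
        table = torch.zeros(self.width)
        table.index_add_(0, self.bucket, vec * self.sign)    # signed sums per bucket
        return table

    def decode(self, table: torch.Tensor) -> torch.Tensor:
        return table[self.bucket] * self.sign                # unbiased but noisy estimate

sk = CountSketch(dim=10_000, width=500)
g = torch.randn(10_000)
table = sk.encode(g)        # 500 numbers stand in for 10,000 gradient entries
g_est = sk.decode(table)
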
Efficient Federated Learning Via Low-Rank Gradient Compression for Intelligent Transportation System
Q Li, X Ma, T **ao, Y Zhu, R Cai - 2024 Cross Strait Radio …, 2024 - ieeexplore.ieee.org
Within the realm of intelligent transportation systems, be it for autonomous vehicles or other
applications, the window of opportunity for executing distributed learning is constrained …
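
As a generic illustration of low-rank gradient compression (in the spirit of PowerSGD-style methods, not necessarily this paper's algorithm), the sketch below compresses an m-by-n weight gradient into two thin rank-r factors with a single power iteration; the rank and tensor sizes are arbitrary assumptions.

import torch

def lowrank_compress(G: torch.Tensor, r: int = 4, seed: int = 0):
    """Approximate G (m x n) by P @ Q^T with thin factors P (m x r) and Q (n x r)."""
    n = G.shape[1]
    Q = torch.randn(n, r, generator=torch.Generator().manual_seed(seed))
    P = G @ Q                          # m x r
    P, _ = torch.linalg.qr(P)          # orthonormalise the columns of P
    Q = G.t() @ P                      # n x r
    return P, Q                        # communicate m*r + n*r floats instead of m*n

G = torch.randn(1024, 1024)
P, Q = lowrank_compress(G, r=4)
G_hat = P @ Q.t()                      # receiver's rank-4 reconstruction of the gradient
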