SANCUS: staleness-aware communication-avoiding full-graph decentralized training in large-scale graph neural networks
Graph neural networks (GNNs) have emerged due to their success at modeling graph data.
Yet, it is challenging for GNNs to efficiently scale to large graphs. Thus, distributed GNNs …
SpotServe: Serving generative large language models on preemptible instances
The high computational and memory requirements of generative large language models
(LLMs) make it challenging to serve them cheaply. This paper aims to reduce the monetary …
BlindFL: Vertical federated machine learning without peeking into your data
Due to rising concerns over privacy protection, how to build machine learning (ML) models
over different data sources with security guarantees is attracting growing attention. Vertical …
Galvatron: Efficient transformer training over multiple GPUs using automatic parallelism
Transformer models have achieved state-of-the-art performance across various application
domains and have gradually become the foundation of advanced large deep learning …
HET: scaling out huge embedding model training via cache-enabled distributed framework
Embedding models have been an effective learning paradigm for high-dimensional data.
However, one open issue of embedding models is that their representations (latent factors) …
Flash-LLM: Enabling cost-effective and highly-efficient large generative model inference with unstructured sparsity
With the rapid growth of parameter sizes, it becomes increasingly challenging to deploy large
generative models, as they typically incur large GPU memory consumption and massive …
Distributed Machine Learning in Edge Computing: Challenges, Solutions and Future Directions
J Tu, L Yang, J Cao - ACM Computing Surveys, 2024 - dl.acm.org
Distributed machine learning at the edge is widely used in intelligent transportation, smart
home, industrial manufacturing, and underground pipe network monitoring to achieve low …
DeAR: Accelerating distributed deep learning with fine-grained all-reduce pipelining
Communication scheduling has been shown to be effective in accelerating distributed
training, as it enables all-reduce communication to be overlapped with backpropagation …
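A minimal sketch of the underlying idea, overlapping gradient all-reduce with the backward pass, assuming PyTorch >= 2.1 with torch.distributed already initialized; it illustrates generic gradient/communication overlap rather than DeAR's fine-grained pipelining, and the helper names are hypothetical:

```python
import torch
import torch.distributed as dist


def attach_overlap_hooks(model, pending):
    """Launch an asynchronous all-reduce for each parameter's gradient as soon
    as it is accumulated, so communication overlaps with the remaining backward
    computation (hypothetical helper, illustrative only)."""
    def hook(param):
        work = dist.all_reduce(param.grad, op=dist.ReduceOp.SUM, async_op=True)
        pending.append((param, work))

    for p in model.parameters():
        if p.requires_grad:
            # Fires right after the gradient has been written into p.grad (PyTorch >= 2.1).
            p.register_post_accumulate_grad_hook(hook)


def finish_overlap(pending):
    """Wait for the outstanding all-reduces and average the gradients."""
    world_size = dist.get_world_size()
    for param, work in pending:
        work.wait()
        param.grad.div_(world_size)
    pending.clear()
```

In a training step, one would call loss.backward() and then finish_overlap(pending) before optimizer.step(), so that most all-reduces complete while later layers are still being differentiated.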
SDPipe: A semi-decentralized framework for heterogeneity-aware pipeline-parallel training
The increasing size of both deep learning models and training data necessitates the ability
to scale out model training through pipeline-parallel training, which combines pipelined …
HET-GMP: A graph-based system approach to scaling large embedding model training
Embedding models have been recognized as an effective learning paradigm for
high-dimensional data. However, a major obstacle in embedding model training is that updating …