The evolution of distributed systems for graph neural networks and their origin in graph processing and deep learning: A survey

J Vatter, R Mayer, HA Jacobsen - ACM Computing Surveys, 2023 - dl.acm.org
Graph neural networks (GNNs) are an emerging research field. This specialized deep
neural network architecture is capable of processing graph structured data and bridges the …
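For readers unfamiliar with the architecture this survey covers, a single message-passing layer can be sketched in a few lines. The mean aggregation, weight names, and ReLU below are illustrative assumptions, not notation from the survey itself.

```python
import numpy as np

def gnn_layer(H, A, W, b):
    """One illustrative message-passing layer (mean aggregation over neighbors).

    H: (n, d) node features, A: (n, n) 0/1 adjacency matrix,
    W: (d, d_out) weights, b: (d_out,) bias -- all assumed names.
    """
    deg = A.sum(axis=1, keepdims=True).clip(min=1)  # guard isolated nodes
    msgs = (A @ H) / deg                            # mean of neighbor features
    return np.maximum(0.0, msgs @ W + b)            # linear transform + ReLU

# Toy usage: 3 nodes on a path graph, 4-dim input features, 8-dim output.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)
H_next = gnn_layer(H, A, np.random.randn(4, 8), np.zeros(8))
```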

Efficient sparse collective communication and its application to accelerate distributed deep learning

J Fei, CY Ho, AN Sahu, M Canini, A Sapio - Proceedings of the 2021 …, 2021 - dl.acm.org
Efficient collective communication is crucial to parallel-computing applications such as
distributed training of large-scale recommendation systems and natural language …
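As a rough illustration of sparse collective communication, each worker can transmit only its largest gradient coordinates as (index, value) pairs, which the aggregation step then merges. This is a generic sketch, not the paper's protocol; real systems merge non-zero blocks in the network rather than densifying on one node.

```python
import numpy as np

def sparsify(grad, k):
    """Keep the k largest-magnitude entries (illustrative top-k sparsifier)."""
    idx = np.argsort(np.abs(grad))[-k:]
    return idx, grad[idx]

def sparse_aggregate(sparse_grads, dim):
    """Sum sparse (indices, values) contributions from all workers."""
    total = np.zeros(dim)
    for idx, vals in sparse_grads:
        np.add.at(total, idx, vals)   # scatter-add each worker's non-zeros
    return total

# Toy usage: 4 workers, 1000-dim gradient, top-10 coordinates per worker.
dim, k = 1000, 10
grads = [np.random.randn(dim) for _ in range(4)]
agg = sparse_aggregate([sparsify(g, k) for g in grads], dim)
```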

Unlocking the power of inline floating-point operations on programmable switches

Y Yuan, O Alama, J Fei, J Nelson, DRK Ports… - … USENIX Symposium on …, 2022 - usenix.org
The advent of switches with programmable dataplanes has enabled the rapid development
of new network functionality, as well as providing a platform for acceleration of a broad …

Distributed artificial intelligence: Taxonomy, review, framework, and reference architecture

N Janbi, I Katib, R Mehmood - Intelligent Systems with Applications, 2023 - Elsevier
Artificial intelligence (AI) research and market have grown rapidly in the last few years, and
this trend is expected to continue with many potential advancements and innovations in this …

Time-correlated sparsification for communication-efficient federated learning

E Ozfatura, K Ozfatura, D Gündüz - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Federated learning (FL) enables multiple clients to collaboratively train a shared model, with
the help of a parameter server (PS), without disclosing their local datasets. However, due to …
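The basic federated learning loop the abstract refers to can be sketched as clients taking local steps and a parameter server averaging the results. The least-squares client objective and function names below are assumptions for illustration, not this paper's method (which adds time-correlated sparsification on top of such a loop).

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    """Illustrative client step: one gradient step on a least-squares loss."""
    X, y = data
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def fl_round(global_w, client_data):
    """One federated round: clients train locally, the PS averages the results."""
    client_ws = [local_update(global_w.copy(), d) for d in client_data]
    return np.mean(client_ws, axis=0)   # FedAvg-style aggregation

# Toy usage: 3 clients holding private (X, y) shards, 5-dim model, 10 rounds.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
w = np.zeros(5)
for _ in range(10):
    w = fl_round(w, clients)
```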

Gemini: Fast failure recovery in distributed training with in-memory checkpoints

Z Wang, Z Jia, S Zheng, Z Zhang, X Fu… - Proceedings of the 29th …, 2023 - dl.acm.org
Large deep learning models have recently garnered substantial attention from both
academia and industry. Nonetheless, frequent failures are observed during large model …
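The core idea of in-memory checkpointing can be sketched as keeping recent training state in RAM (in Gemini's case, replicated in peer machines' CPU memory) so recovery avoids slow remote storage. The dictionary standing in for a peer's memory below is an assumption for illustration, not Gemini's actual replication protocol.

```python
import copy

class InMemoryCheckpointer:
    """Minimal sketch of checkpointing training state to (remote) RAM."""
    def __init__(self):
        self.peer_memory = {}            # stand-in for another host's memory

    def save(self, step, state):
        self.peer_memory[step] = copy.deepcopy(state)

    def restore_latest(self):
        step = max(self.peer_memory)     # most recent surviving checkpoint
        return step, copy.deepcopy(self.peer_memory[step])

# Toy usage: checkpoint periodically, then recover after a simulated failure.
ckpt = InMemoryCheckpointer()
ckpt.save(100, {"weights": [0.1, 0.2], "momentum": [0.0, 0.0]})
ckpt.save(200, {"weights": [0.3, 0.1], "momentum": [0.01, 0.02]})
step, state = ckpt.restore_latest()      # resumes from step 200
```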

Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications

F Strati, X Ma, A Klimovic - … of the Nineteenth European Conference on …, 2024 - dl.acm.org
GPUs are critical for maximizing the throughput-per-Watt of deep neural network (DNN)
applications. However, DNN applications often underutilize GPUs, even when using large …

PipeSwitch: Fast pipelined context switching for deep learning applications

Z Bai, Z Zhang, Y Zhu, X Jin - 14th USENIX Symposium on Operating …, 2020 - usenix.org
Deep learning (DL) workloads include throughput-intensive training tasks and latency-
sensitive inference tasks. The dominant practice today is to provision dedicated GPU …

Dynamic and adaptive fault-tolerant asynchronous federated learning using volunteer edge devices

JÁ Morell, E Alba - Future Generation Computer Systems, 2022 - Elsevier
The number of devices, from smartphones to IoT hardware, interconnected via the Internet is
growing all the time. These devices produce a large amount of data that cannot be analyzed …

On the utility of gradient compression in distributed training systems

S Agarwal, H Wang, S Venkataraman… - Proceedings of …, 2022 - proceedings.mlsys.org
A rich body of prior work has highlighted the existence of communication bottlenecks in
synchronous data-parallel training. To alleviate these bottlenecks, a long line of recent …
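A representative gradient-compression scheme of the kind this paper evaluates is top-k sparsification with error feedback, where dropped coordinates are remembered and re-added on the next step. The class and parameter names below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

class TopKCompressor:
    """Illustrative top-k gradient compressor with error feedback."""
    def __init__(self, dim, k):
        self.k = k
        self.residual = np.zeros(dim)    # error accumulated from dropped entries

    def compress(self, grad):
        corrected = grad + self.residual
        idx = np.argsort(np.abs(corrected))[-self.k:]
        vals = corrected[idx]
        self.residual = corrected.copy()
        self.residual[idx] = 0.0         # keep only what was not transmitted
        return idx, vals

# Toy usage: compress a 1000-dim gradient to 1% of its coordinates.
comp = TopKCompressor(dim=1000, k=10)
idx, vals = comp.compress(np.random.randn(1000))
```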