- Academic Search

Offloading machine learning to programmable data planes: A systematic survey

R Parizotto, BL Coelho, DC Nunes, I Haque… - ACM Computing …, 2023 - dl.acm.org

The demand for machine learning (ML) has increased significantly in recent decades,
enabling several applications, such as speech recognition, computer vision, and …

Cited by 20 · All 3 versions

[PDF] kaust.edu.sa

In-network aggregation with transport transparency for distributed training

S Liu, Q Wang, J Zhang, W Wu, Q Lin, Y Liu… - Proceedings of the 28th …, 2023 - dl.acm.org

Recent In-Network Aggregation (INA) solutions offload the all-reduce operation onto network
switches to accelerate and scale distributed training (DT). On end hosts, these solutions …

Cited by 28 · All 5 versions

[PDF] arxiv.org

NetFC: Enabling accurate floating-point arithmetic on programmable switches

P Cui, H Pan, Z Li, J Wu, S Zhang… - 2021 IEEE 29th …, 2021 - ieeexplore.ieee.org

Programmable switches are recently used for accelerating data-intensive distributed
applications. Some computational tasks, traditionally performed on servers in data centers …

Cited by 33 · All 5 versions

[PDF] researchgate.net

P4DB - the case for in-network OLTP

M Jasny, L Thostrup, T Ziegler, C Binnig - Proceedings of the 2022 …, 2022 - dl.acm.org

In this paper we present a new approach for distributed DBMSs called P4DB, that uses a
programmable switch to accelerate OLTP workloads. The main idea of P4DB is that it …

Cited by 22 · All 5 versions

A roadmap for big model

S Yuan, H Zhao, S Zhao, J Leng, Y Liang… - arXiv preprint arXiv …, 2022 - arxiv.org

With the rapid development of deep learning, training Big Models (BMs) for multiple
downstream tasks becomes a popular paradigm. Researchers have achieved various …

Cited by 21 · All 2 versions

[PDF] arxiv.org

Efficient data-plane memory scheduling for in-network aggregation

H Wang, Y Qin, CL Lao, Y Le, W Wu, K Chen - arXiv preprint arXiv …, 2022 - arxiv.org

As the scale of distributed training grows, communication becomes a bottleneck. To
accelerate the communication, recent works introduce In-Network Aggregation (INA), which …

Cited by 21 · All 3 versions

[PDF] acm.org

Associative memory based experience replay for deep reinforcement learning

M Li, A Kazemi, AF Laguna, XS Hu - Proceedings of the 41st IEEE/ACM …, 2022 - dl.acm.org

Experience replay is an essential component in deep reinforcement learning (DRL), which
stores the experiences and generates experiences for the agent to learn in real time …

Cited by 14 · All 5 versions

Emotion detection in Instagram social media platform

LJ Sailesh, VK Kumar, K Nimala… - 2023 International …, 2023 - ieeexplore.ieee.org

Depression is regarded as an important issue because it is the largest cause of disability
around the world and a major factor in the formation of serious medical conditions, which …

Cited by 8

[PDF] github.io

Preemptive switch memory usage to accelerate training jobs with shared in-network aggregation

H Wang, Y Qin, CL Lao, Y Le, W Wu… - 2023 IEEE 31st …, 2023 - ieeexplore.ieee.org

Recent works introduce In-Network Aggregation (INA) for distributed training (DT), which
moves the gradient summation into network programmable switches. INA can reduce the …

Cited by 5 · All 4 versions

[PDF] arxiv.org

NetReduce: RDMA-compatible in-network reduction for distributed DNN training acceleration

S Liu, Q Wang, J Zhang, Q Lin, Y Liu, M Xu… - arXiv preprint arXiv …, 2020 - arxiv.org

We present NetReduce, a novel RDMA-compatible in-network reduction architecture to
accelerate distributed DNN training. Compared to existing designs, NetReduce maintains a …

Cited by 14 · All 2 versions

Accelerating distributed reinforcement learning with in-switch computing. In 2019 ACM/IEEE...

Offloading machine learning to programmable data planes: A systematic survey
