Google Scholar

Morpheus: Creating application objects efficiently for heterogeneous computing

HW Tseng, Q Zhao, Y Zhou, M Gahagan… - ACM SIGARCH …, 2016 - dl.acm.org

In high performance computing systems, object deserialization can become a surprisingly
important bottleneck---in our test, a set of general-purpose, highly parallelized applications …

Cited by 90

Multi-GPU communication schemes for iterative solvers: When CPUs are not in charge

I Ismayilov, J Baydamirli, D Sağbili, M Wahib… - Proceedings of the 37th …, 2023 - dl.acm.org

This paper proposes a fully autonomous execution model for multi-GPU applications that
completely excludes the involvement of the CPU beyond the initial kernel launch. In a typical …

Cited by 12

The landscape of GPU-centric communication

D Unat, I Turimbetov, MKT Issa, D Sağbili… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, GPUs have become the preferred accelerators for HPC and ML applications
due to their parallelism and fast memory bandwidth. While GPUs boost computation, inter …

Cited by 1

Network endpoint congestion control for fine-grained communication

N Jiang, L Dennison, WJ Dally - … of the International Conference for High …, 2015 - dl.acm.org

Endpoint congestion in HPC networks creates tree saturation that is detrimental to
performance. Endpoint congestion can be alleviated by reducing the injection rate of traffic …

Cited by 61

dCUDA: hardware supported overlap of computation and communication

T Gysi, J Bär, T Hoefler - SC'16: Proceedings of the …, 2016 - ieeexplore.ieee.org

Over the last decade, CUDA and the underlying GPU hardware architecture have
continuously gained popularity in various high-performance computing application domains …

Cited by 38

InfiniBand Verbs on GPU: a case study of controlling an InfiniBand network device from the GPU

L Oden, H Fröning - The International Journal of High …, 2017 - journals.sagepub.com

Due to their massive parallelism and high performance per Watt, GPUs have gained high
popularity in high-performance computing and are a strong candidate for future exascale …

Cited by 45

GPU initiated OpenSHMEM: correct and efficient intra-kernel networking for dGPUs

K Hamidouche, M LeBeane - Proceedings of the 25th ACM SIGPLAN …, 2020 - dl.acm.org

Current state-of-the-art in GPU networking utilizes a host-centric, kernel-boundary
communication model that reduces performance and increases code complexity. To address …

Cited by 20

Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters

K Hamidouche, A Venkatesh, AA Awan… - 2015 IEEE …, 2015 - ieeexplore.ieee.org

GPUDirect RDMA (GDR) brings the high-performance communication capabilities of RDMA
networks like InfiniBand (IB) to GPUs (referred to as "Device"). It enables IB network …

Cited by 31

Relaxations for high-performance message passing on massively parallel SIMT processors

B Klenk, H Fröning, H Eberle… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org

Accelerators, such as GPUs, have proven to be highly successful in reducing execution time
and power consumption of compute-intensive applications. Even though they are already …

Cited by 31

GPU triggered networking for intra-kernel communications

M LeBeane, K Hamidouche, B Benton… - Proceedings of the …, 2017 - dl.acm.org

GPUs are widespread across clusters of compute nodes due to their attractive performance
for data parallel codes. However, communicating between GPUs across the cluster is …

Cited by 26
