Google Scholar

Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results

T Hoefler, R Belli - Proceedings of the international conference for high …, 2015 - dl.acm.org

Measuring and reporting performance of parallel computers constitutes the basis for
scientific advancement of high-performance computing (HPC). Most scientific reports show …

Cited by 332 · All 37 versions

[PDF] osti.gov

There goes the neighborhood: performance degradation due to nearby jobs

A Bhatele, K Mohror, SH Langer… - Proceedings of the …, 2013 - dl.acm.org

Predictable performance is important for understanding and alleviating application
performance issues; quantifying the effects of source code, compiler, or system software …

Cited by 260 · All 17 versions

[PDF] ethz.ch

Using automated performance modeling to find scalability bugs in complex codes

A Calotoiu, T Hoefler, M Poke, F Wolf - Proceedings of the International …, 2013 - dl.acm.org

Many parallel applications suffer from latent performance limitations that may prevent them
from scaling to larger machine sizes. Often, such scalability bugs manifest themselves only …

Cited by 196 · All 27 versions

[PDF] arxiv.org

Clairvoyant prefetching for distributed machine learning I/O

N Dryden, R Böhringer, T Ben-Nun… - Proceedings of the …, 2021 - dl.acm.org

I/O is emerging as a major bottleneck for machine learning training, especially in distributed
environments. Indeed, at large scale, I/O takes as much as 85% of training time. Addressing …

Cited by 66 · All 22 versions

[PDF] arxiv.org

GossipGraD: Scalable deep learning using gossip communication based asynchronous gradient descent

J Daily, A Vishnu, C Siegel, T Warfel… - arXiv preprint arXiv …, 2018 - arxiv.org

In this paper, we present GossipGraD, a gossip communication protocol based Stochastic
Gradient Descent (SGD) algorithm for scaling Deep Learning (DL) algorithms on large-scale …

Cited by 114 · All 4 versions

[PDF] arxiv.org

Flare: Flexible in-network allreduce

D De Sensi, S Di Girolamo, S Ashkboos, S Li… - Proceedings of the …, 2021 - dl.acm.org

The allreduce operation is one of the most commonly used communication routines in
distributed applications. To improve its bandwidth and to reduce network traffic, this …

Cited by 49 · All 27 versions

The SIMNET virtual world architecture

J Calvin, A Dickens, B Gaines… - Proceedings of IEEE …, 1993 - ieeexplore.ieee.org

Many tools and techniques have been developed to address specific aspects of interacting
in a virtual world. Few have been designed with an architecture that allows large numbers of …

Cited by 301 · All 6 versions

[PDF] arxiv.org

sPIN: High-performance streaming Processing in the Network

T Hoefler, S Di Girolamo, K Taranov, RE Grant… - Proceedings of the …, 2017 - dl.acm.org

Optimizing communication performance is imperative for large-scale computing because
communication overheads limit the strong scalability of parallel applications. Today's …

Cited by 104 · All 27 versions

[PDF] kuleuven.be

Hiding global communication latency in the GMRES algorithm on massively parallel machines

P Ghysels, TJ Ashby, K Meerbergen… - SIAM journal on scientific …, 2013 - SIAM

In the generalized minimal residual method (GMRES), the global all-to-all communication
required in each iteration for orthogonalization and normalization of the Krylov base vectors …

Cited by 172 · All 11 versions

[PDF] acm.org

Run-to-run variability on Xeon Phi based Cray XC systems

S Chunduri, K Harms, S Parker, V Morozov… - Proceedings of the …, 2017 - dl.acm.org

The increasing complexity of HPC systems has introduced new sources of variability, which
can contribute to significant differences in run-to-run performance of applications. With …

Cited by 100 · All 5 versions

Characterizing the influence of system noise on large-scale applications by simulation
