Google Scholar

The Sol supercomputer at Arizona State University

DM Jennewein, J Lee, C Kurtz, W Dizon… - … and Experience in …, 2023 - dl.acm.org

The Sol supercomputer provides ASU researchers access to a state-of-the-art system with
an observed GPU-only HPL speed of 2.272 PetaFLOP/s. This short paper provides a …

Cited by 37; all 4 versions

[PDF] mlr.press

Swarm parallelism: Training large models can be surprisingly communication-efficient

M Ryabinin, T Dettmers, M Diskin… - … on Machine Learning, 2023 - proceedings.mlr.press

Many deep learning applications benefit from using large models with billions of parameters.
Training these models is notoriously expensive due to the need for specialized HPC …

Cited by 27; all 7 versions

[PDF] arxiv.org

Short reasons for long vectors in HPC CPUs: a study based on RISC-V

P Vizcaino, G Ieronymakis, N Dimou… - Proceedings of the SC' …, 2023 - dl.acm.org

For years, SIMD/vector units have enhanced the capabilities of modern CPUs in High-
Performance Computing (HPC) and mobile technology. Typical commercially-available …

Cited by 12; all 5 versions

[PDF] arxiv.org

Method for scalable and performant GPU-accelerated simulation of multiphase compressible flow

A Radhakrishnan, H Le Berre, B Wilfong… - Computer Physics …, 2024 - Elsevier

Multiphase compressible flows are often characterized by a broad range of space and time
scales, entailing large grids and small time steps. Simulations of these flows on CPU-based …

Cited by 6; all 7 versions

[PDF] lbl.gov

A case study of porting HPGMG from CUDA to OpenMP target offload

C Daley, H Ahmed, S Williams, N Wright - OpenMP: Portable Multi-Level …, 2020 - Springer

The HPGMG benchmark is a non-trivial Multigrid benchmark used to evaluate system
performance. We ported this benchmark from CUDA to OpenMP target offload and added …

Cited by 24; all 4 versions

[PDF] acm.org

Application experiences on a GPU-accelerated Arm-based HPC testbed

W Elwasif, W Godoy, N Hagerty, JA Harris… - Proceedings of the …, 2023 - dl.acm.org

This paper assesses and reports the experience of ten teams working to port, validate, and
benchmark several High Performance Computing applications on a novel GPU-accelerated …

Cited by 9; all 12 versions

[PDF] arxiv.org

The specialized high-performance network on Anton 3

KS Shim, B Greskamp, B Towles… - … Symposium on High …, 2022 - ieeexplore.ieee.org

Molecular dynamics (MD) simulation, a computationally intensive method that provides
invaluable insights into the behavior of biomolecules, typically requires large-scale …

Cited by 10; all 4 versions

On the Performance Investigation of a Recursive Fast Optical Switch-Based High Performance Computing Network Architecture

F Yan, X Deng, C Yuan, B Yan… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org

We propose a novel high performance computing (HPC) network architecture based on
parallel levels distributed low radix fast optical switches (FOS). We provide a detailed …

Cited by 2; all 4 versions

[PDF] arxiv.org

Portability and Scalability of OpenMP Offloading on State-of-the-art Accelerators

Y Fridman, G Tamir, G Oren - International Conference on High …, 2023 - Springer

Over the last decade, most of the increase in computing power has been gained by
advances in accelerated many-core architectures, mainly in the form of GPGPUs. While …

Cited by 4; all 8 versions

[PDF] arxiv.org

Exploring fully offloaded GPU stream-aware message passing

N Namashivayam, K Kandalla, JB White III… - arXiv preprint arXiv …, 2023 - arxiv.org

Modern heterogeneous supercomputing systems are comprised of CPUs, GPUs, and high-
speed network interconnects. Communication libraries supporting efficient data transfers …

Cited by 2; all 3 versions

Scaling the Summit: deploying the World’s fastest supercomputer

The Sol supercomputer at Arizona State University
