- Academic Search

Evaluating modern GPU interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect

A Li, SL Song, J Chen, J Li, X Liu… - … on Parallel and …, 2019 - ieeexplore.ieee.org

High performance multi-GPU computing becomes an inevitable trend due to the ever-increasing
demand on computation capability in emerging domains such as deep learning …

Cited by 298; 10 versions



SV-Sim: scalable PGAS-based state vector simulation of quantum circuits

A Li, B Fang, C Granade, G Prawiroatmodjo… - Proceedings of the …, 2021 - dl.acm.org

High-performance quantum circuit simulation in a classic HPC is still imperative in the NISQ
era. Observing that the major obstacle of scalable state-vector quantum simulation arises …

Cited by 39; 4 versions



APNN-TC: Accelerating arbitrary precision neural networks on Ampere GPU tensor cores

B Feng, Y Wang, T Geng, A Li, Y Ding - Proceedings of the international …, 2021 - dl.acm.org

Over the years, accelerating neural networks with quantization has been widely studied.
Unfortunately, prior efforts with diverse precisions (e.g., 1-bit weights and 2-bit activations) are …

Cited by 43; 7 versions



Density matrix quantum circuit simulation via the BSP machine on modern GPU clusters

A Li, O Subasi, X Yang… - … conference for high …, 2020 - ieeexplore.ieee.org

As quantum computers evolve, simulations of quantum programs on classical computers will
be essential in validating quantum algorithms, understanding the effect of system noise, and …

Cited by 44; 7 versions



Tartan: evaluating modern GPU interconnect via a multi-GPU benchmark suite

A Li, SL Song, J Chen, X Liu, N Tallent… - 2018 IEEE …, 2018 - ieeexplore.ieee.org

High performance multi-GPU computing becomes an inevitable trend due to the ever-increasing
demand on computation capability in emerging domains such as deep learning …

Cited by 69; 4 versions



Register optimizations for stencils on GPUs

PS Rawat, F Rastello, A Sukumaran-Rajam… - Proceedings of the 23rd …, 2018 - dl.acm.org

The recent advent of compute-intensive GPU architecture has allowed application
developers to explore high-order 3D stencils for better computational accuracy. A common …

Cited by 70; 8 versions



Accelerating binarized neural networks via bit-tensor-cores in Turing GPUs

A Li, S Su - IEEE Transactions on Parallel and Distributed …, 2020 - ieeexplore.ieee.org

Despite foreseeing tremendous speedups over conventional deep neural networks, the
performance advantage of binarized neural networks (BNNs) has merely been showcased …

Cited by 39; 6 versions



BSTC: A novel binarized-soft-tensor-core design for accelerating bit-based approximated neural nets

A Li, T Geng, T Wang, M Herbordt, SL Song… - Proceedings of the …, 2019 - dl.acm.org

Binarized neural networks (or BNNs) promise tremendous performance improvement over
traditional DNNs through simplified bit-level computation and significantly reduced memory …

Cited by 47; 4 versions



MAPA: Multi-accelerator pattern allocation policy for multi-tenant GPU servers

K Ranganath, JD Suetterlein, JB Manzano… - Proceedings of the …, 2021 - dl.acm.org

Multi-accelerator servers are increasingly being deployed in shared multi-tenant
environments (such as in cloud data centers) in order to meet the demands of large-scale …

Cited by 23; 8 versions



Adaptive auto-tuning framework for global exploration of stencil optimization on GPUs

Q Sun, Y Liu, H Yang, Z Jiang, Z Luan… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Stencil computations are widely used in high performance computing (HPC) applications.
Many HPC platforms utilize the high computation capability of GPUs to accelerate stencil …

Cited by 5; 4 versions


Saved in My library

Critical points based register-concurrency autotuning for GPUs

Evaluating modern GPU interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect

SV-Sim: scalable PGAS-based state vector simulation of quantum circuits

APNN-TC: Accelerating arbitrary precision neural networks on Ampere GPU tensor cores

Density matrix quantum circuit simulation via the BSP machine on modern GPU clusters

Tartan: evaluating modern GPU interconnect via a multi-GPU benchmark suite

Register optimizations for stencils on GPUs

Accelerating binarized neural networks via bit-tensor-cores in Turing GPUs

BSTC: A novel binarized-soft-tensor-core design for accelerating bit-based approximated neural nets

MAPA: Multi-accelerator pattern allocation policy for multi-tenant GPU servers

Adaptive auto-tuning framework for global exploration of stencil optimization on GPUs