- Academic Search

Domain-specific multi-level IR rewriting for GPU: The Open Earth compiler for GPU-accelerated climate simulation

T Gysi, C Müller, O Zinenko, S Herhut, E Davis… - ACM Transactions on …, 2021 - dl.acm.org

Most compilers have a single core intermediate representation (IR) (e.g., LLVM) sometimes
complemented with vaguely defined IR-like data structures. This IR is commonly low-level …

Cited by 61 · All 22 versions

[PDF] acm.org

AN5D: automated stencil framework for high-degree temporal blocking on GPUs

K Matsumura, HR Zohouri, M Wahib, T Endo… - Proceedings of the 18th …, 2020 - dl.acm.org

Stencil computation is one of the most widely-used compute patterns in high performance
computing applications. Spatial and temporal blocking have been proposed to overcome the …

Cited by 65 · All 5 versions

[HTML] sciencedirect.com

[HTML] Efficient simulation execution of cellular automata on GPU

D Cagigas-Muñiz, F Diaz-del-Rio… - … Modelling Practice and …, 2022 - Elsevier

Graphics Processing Units (GPUs) can be used as convenient hardware
accelerators to speed up Cellular Automata (CA) simulations, which are employed in many …

Cited by 22 · All 4 versions

Toward accelerated stencil computation by adapting tensor core unit on GPU

X Liu, Y Liu, H Yang, J Liao, M Li, Z Luan… - Proceedings of the 36th …, 2022 - dl.acm.org

The Tensor Core Unit (TCU) has been increasingly adopted on modern high performance
processors, specialized in boosting the performance of general matrix multiplication …

Cited by 21 · All 2 versions

[PDF] acm.org

Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay

K Parasyris, G Georgakoudis, E Rangel… - Proceedings of the …, 2023 - dl.acm.org

HPC is a heterogeneous world in which host and device code are interleaved throughout
the application. Given the significant performance advantage of accelerators, device code …

Cited by 9 · All 4 versions

[PDF] arxiv.org

A μ-mode integrator for solving evolution equations in Kronecker form

M Caliari, F Cassini, L Einkemmer, A Ostermann… - Journal of …, 2022 - Elsevier

In this paper, we propose a μ-mode integrator for computing the solution of stiff evolution
equations. The integrator is based on a d-dimensional splitting approach and uses exact …

Cited by 23 · All 9 versions

[PDF] google.com

On optimizing complex stencils on GPUs

PS Rawat, M Vaidya… - 2019 IEEE …, 2019 - ieeexplore.ieee.org

Stencil computations are often the compute-intensive kernel in many scientific applications.
With the increasing demand for computational accuracy, and the emergence of massively …

Cited by 42 · All 2 versions

[PDF] arxiv.org

A versatile software systolic execution model for GPU memory-bound kernels

P Chen, M Wahib, S Takizawa, R Takano… - Proceedings of the …, 2019 - dl.acm.org

This paper proposes a versatile high-performance execution model, inspired by systolic
arrays, for memory-bound regular kernels running on CUDA-enabled GPUs. We formulate a …

Cited by 29 · All 5 versions

Automated code generation of high-order stencils for a dataflow architecture

R Sai, J Mellor-Crummey, J Xu… - … Conference for High …, 2024 - ieeexplore.ieee.org

Finite-difference methods based on high-order stencils are widely used in seismic
simulations, weather forecasting, and computational fluid dynamics. Recently, multiple …

Cited by 1 · All 3 versions

[PDF] arxiv.org

Accelerating high-order stencils on GPUs

R Sai, J Mellor-Crummey, X Meng… - 2020 IEEE/ACM …, 2020 - ieeexplore.ieee.org

While implementation strategies for low-order stencils on GPUs have been well-studied in
the literature, not all of the techniques work well for high-order stencils, such as those used …

Cited by 22 · All 10 versions

Domain-specific optimization and generation of high-performance GPU code for stencil computations
