- Academic Search

Tiramisu: A polyhedral compiler for expressing fast and portable code

R Baghdadi, J Ray, MB Romdhane… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org

This paper introduces Tiramisu, a polyhedral framework designed to generate high
performance code for multiple platforms including multicores, GPUs, and distributed …

Cited by 396 - All 14 versions

Green-Marl: a DSL for easy and efficient graph analysis

S Hong, H Chafi, E Sedlar, K Olukotun - Proceedings of the seventeenth …, 2012 - dl.acm.org

The increasing importance of graph-data based applications is fueling the need for highly
efficient and parallel implementations of graph analysis software. In this paper we describe …

Cited by 412 - All 14 versions

Delite: A compiler architecture for performance-oriented embedded domain-specific languages

AK Sujeeth, KJ Brown, H Lee, T Rompf… - ACM Transactions on …, 2014 - dl.acm.org

Developing high-performance software is a difficult task that requires the use of low-level,
architecture-specific programming models (e.g., OpenMP for CMPs, CUDA for GPUs, MPI for …

Cited by 274 - All 8 versions

OptiML: an implicitly parallel domain-specific language for machine learning

A Sujeeth, HJ Lee, K Brown, T Rompf… - Proceedings of the …, 2011 - researchgate.net

As the size of datasets continues to grow, machine learning applications are becoming
increasingly limited by the amount of available computational power. Taking advantage of …

Cited by 306 - All 12 versions

Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code

M Steuwer, C Fensch, S Lindley, C Dubach - ACM SIGPLAN Notices, 2015 - dl.acm.org

Computers have become increasingly complex with the emergence of heterogeneous
hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous …

Cited by 191 - All 19 versions

A heterogeneous parallel framework for domain-specific languages

KJ Brown, AK Sujeeth, HJ Lee, T Rompf… - 2011 International …, 2011 - ieeexplore.ieee.org

Computing systems are becoming increasingly parallel and heterogeneous, and therefore
new applications must be capable of exploiting parallelism in order to continue achieving …

Cited by 280 - All 13 versions

Pencil: A platform-neutral compute intermediate language for accelerator programming

R Baghdadi, U Beaugnon, A Cohen… - 2015 International …, 2015 - ieeexplore.ieee.org

Programming accelerators such as GPUs with low-level APIs and languages such as
OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic …

Cited by 164 - All 23 versions

Codon: A compiler for high-performance Pythonic applications and DSLs

A Shajii, G Ramirez, H Smajlović, J Ray… - Proceedings of the …, 2023 - dl.acm.org

Domain-specific languages (DSLs) are able to provide intuitive high-level abstractions that
are easy to work with while attaining better performance than general-purpose languages …

Cited by 28 - All 6 versions

CudaDMA: optimizing GPU memory bandwidth via warp specialization

M Bauer, H Cook, B Khailany - … of 2011 international conference for high …, 2011 - dl.acm.org

As the computational power of GPUs continues to scale with Moore's Law, an increasing
number of applications are becoming limited by memory bandwidth. We propose an …

Cited by 221 - All 15 versions

DimmWitted: A study of main-memory statistical analytics

C Zhang, C Ré - arXiv preprint arXiv:1403.7550, 2014 - arxiv.org

We perform the first study of the tradeoff space of access methods and replication to support
statistical analytics using first-order methods executed in the main memory of a Non-Uniform …

Cited by 169 - All 16 versions

A domain-specific approach to heterogeneous parallelism

Tiramisu: A polyhedral compiler for expressing fast and portable code
