GPU devices for safety-critical systems: A survey

J Perez-Cerrolaza, J Abella, L Kosmidis… - ACM Computing …, 2022 - dl.acm.org

Graphics Processing Unit (GPU) devices and their associated software programming
languages and frameworks can deliver the computing performance required to facilitate the …

Cited by 32

The sparse polyhedral framework: Composing compiler-generated inspector-executor code

MM Strout, M Hall, C Olschanowsky - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org

Irregular applications such as big graph analysis, material simulations, molecular dynamics
simulations, and finite element analysis have performance problems due to their use of …

Cited by 90

Rammer: Enabling holistic deep learning compiler optimizations with rTasks

L Ma, Z Xie, Z Yang, J Xue, Y Miao, W Cui… - … USENIX Symposium on …, 2020 - usenix.org

Performing Deep Neural Network (DNN) computation on hardware accelerators efficiently is
challenging. Existing DNN frameworks and compilers often treat the DNN operators in a …

Cited by 163

Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks

S Ghodrati, BH Ahn, JK Kim, S Kinzer… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org

Deep Neural Networks (DNNs) have reinvigorated real-world applications that rely on
learning patterns of data and are permeating into different industries and markets. Cloud …

Cited by 132

Simultaneous multikernel GPU: Multi-tasking throughput processors via fine-grained sharing

Z Wang, J Yang, R Melhem, B Childers… - … symposium on high …, 2016 - ieeexplore.ieee.org

Studies show that non-graphics programs can be less optimized for the GPU hardware,
leading to significant resource under-utilization. Sharing the GPU among multiple programs …

Cited by 201

Heimdall: mobile GPU coordination platform for augmented reality applications

J Yi, Y Lee - Proceedings of the 26th Annual International …, 2020 - dl.acm.org

We present Heimdall, a mobile GPU coordination platform for emerging Augmented Reality
(AR) applications. Future AR apps impose an explored challenging workload: i) concurrent …

Cited by 79

EffiSha: A software framework for enabling efficient preemptive scheduling of GPU

G Chen, Y Zhao, X Shen, H Zhou - … on Principles and Practice of Parallel …, 2017 - dl.acm.org

Modern GPUs are broadly adopted in many multitasking environments, including data
centers and smartphones. However, the current support for the scheduling of multiple GPU …

Cited by 117

Hardware compute partitioning on NVIDIA GPUs

J Bakita, JH Anderson - 2023 IEEE 29th Real-Time and …, 2023 - ieeexplore.ieee.org

Embedded and autonomous systems are increasingly integrating AI/ML features, often
enabled by a hardware accelerator such as a GPU. As these workloads become …

Cited by 24

Dynamic resource management for efficient utilization of multitasking GPUs

JJK Park, Y Park, S Mahlke - Proceedings of the twenty-second …, 2017 - dl.acm.org

As graphics processing units (GPUs) are broadly adopted, running multiple applications on
a GPU at the same time is beginning to attract wide attention. Recent proposals on …

Cited by 99

Locality-aware CTA clustering for modern GPUs

A Li, SL Song, W Liu, X Liu, A Kumar… - ACM SIGARCH …, 2017 - dl.acm.org

Cache is designed to exploit locality; however, the role of on-chip L1 data caches on modern
GPUs is often awkward. The locality among global memory requests from different SMs …

Cited by 96

Enabling and exploiting flexible task assignment on GPU through SM-centric program transformations
