GPU devices for safety-critical systems: A survey

J Perez-Cerrolaza, J Abella, L Kosmidis… - ACM Computing …, 2022 - dl.acm.org
Graphics Processing Unit (GPU) devices and their associated software programming
languages and frameworks can deliver the computing performance required to facilitate the …

The sparse polyhedral framework: Composing compiler-generated inspector-executor code

MM Strout, M Hall, C Olschanowsky - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org
Irregular applications such as big graph analysis, material simulations, molecular dynamics
simulations, and finite element analysis have performance problems due to their use of …

Rammer: Enabling holistic deep learning compiler optimizations with {rTasks}

L Ma, Z **e, Z Yang, J Xue, Y Miao, W Cui… - … USENIX Symposium on …, 2020 - usenix.org
Performing Deep Neural Network (DNN) computation on hardware accelerators efficiently is
challenging. Existing DNN frameworks and compilers often treat the DNN operators in a …

Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks

S Ghodrati, BH Ahn, JK Kim, S Kinzer… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have reinvigorated real-world applications that rely on
learning patterns of data and are permeating into different industries and markets. Cloud …

Simultaneous multikernel GPU: Multi-tasking throughput processors via fine-grained sharing

Z Wang, J Yang, R Melhem, B Childers… - … symposium on high …, 2016 - ieeexplore.ieee.org
Studies show that non-graphics programs can be less optimized for the GPU hardware,
leading to significant resource under-utilization. Sharing the GPU among multiple programs …

Heimdall: mobile GPU coordination platform for augmented reality applications

J Yi, Y Lee - Proceedings of the 26th Annual International …, 2020 - dl.acm.org
We present Heimdall, a mobile GPU coordination platform for emerging Augmented Reality
(AR) applications. Future AR apps impose an explored challenging workload: i) concurrent …

Effisha: A software framework for enabling effficient preemptive scheduling of gpu

G Chen, Y Zhao, X Shen, H Zhou - … on Principles and Practice of Parallel …, 2017 - dl.acm.org
Modern GPUs are broadly adopted in many multitasking environments, including data
centers and smartphones. However, the current support for the scheduling of multiple GPU …

Hardware compute partitioning on NVIDIA GPUs

J Bakita, JH Anderson - 2023 IEEE 29th Real-Time and …, 2023 - ieeexplore.ieee.org
Embedded and autonomous systems are increasingly integrating AI/ML features, often
enabled by a hardware accelerator such as a GPU. As these workloads become …

Dynamic resource management for efficient utilization of multitasking GPUs

JJK Park, Y Park, S Mahlke - Proceedings of the twenty-second …, 2017 - dl.acm.org
As graphics processing units (GPUs) are broadly adopted, running multiple applications on
a GPU at the same time is beginning to attract wide attention. Recent proposals on …

Locality-aware CTA clustering for modern GPUs

A Li, SL Song, W Liu, X Liu, A Kumar… - ACM SIGARCH …, 2017 - dl.acm.org
Cache is designed to exploit locality; however, the role of on-chip L1 data caches on modern
GPUs is often awkward. The locality among global memory requests from different SMs …