GPU virtualization and scheduling methods: A comprehensive survey

CH Hong, I Spence, DS Nikolopoulos - ACM Computing Surveys (CSUR …, 2017 - dl.acm.org
The integration of graphics processing units (GPUs) on high-end compute nodes has
established a new accelerator-based heterogeneous computing model, which now …

Cloud computing landscape and research challenges regarding trust and reputation

SM Habib, S Ries, M Muhlhauser - 2010 7th International …, 2010 - ieeexplore.ieee.org
Cloud Computing is an emerging computing paradigm. It shares massively scalable, elastic
resources (e.g., data, calculations, and services) transparently among the users over a …

Graviton: Trusted execution environments on GPUs

S Volos, K Vaswani, R Bruno - 13th USENIX Symposium on Operating …, 2018 - usenix.org
We propose Graviton, an architecture for supporting trusted execution environments on
GPUs. Graviton enables applications to offload security- and performance-sensitive kernels …

Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks

S Ghodrati, BH Ahn, JK Kim, S Kinzer… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have reinvigorated real-world applications that rely on
learning patterns of data and are permeating into different industries and markets. Cloud …

Prema: A predictive multi-task scheduling algorithm for preemptible neural processing units

Y Choi, M Rhu - 2020 IEEE International Symposium on High …, 2020 - ieeexplore.ieee.org
To amortize cost, cloud vendors providing DNN acceleration as a service to end-users
employ consolidation and virtualization to share the underlying resources among multiple …

Simultaneous multikernel GPU: Multi-tasking throughput processors via fine-grained sharing

Z Wang, J Yang, R Melhem, B Childers… - … symposium on high …, 2016 - ieeexplore.ieee.org
Studies show that non-graphics programs can be less optimized for the GPU hardware,
leading to significant resource under-utilization. Sharing the GPU among multiple programs …

Baymax: QoS awareness and increased utilization for non-preemptive accelerators in warehouse scale computers

Q Chen, H Yang, J Mars, L Tang - ACM SIGPLAN Notices, 2016 - dl.acm.org
Modern warehouse-scale computers (WSCs) are being outfitted with accelerators to provide
the significant compute required by emerging intelligent personal assistant (IPA) workloads …

Chimera: Collaborative preemption for multitasking on a shared GPU

JJK Park, Y Park, S Mahlke - ACM SIGARCH Computer Architecture …, 2015 - dl.acm.org
The demand for multitasking on graphics processing units (GPUs) is constantly increasing
as they have become one of the default components on modern computer systems along …

Telekine: Secure computing with cloud GPUs

T Hunt, Z Jia, V Miller, A Szekely, Y Hu… - … USENIX Symposium on …, 2020 - usenix.org
GPUs have become ubiquitous in the cloud due to the dramatic performance gains they
enable in domains such as machine learning and computer vision. However, offloading …

Heimdall: Mobile GPU coordination platform for augmented reality applications

J Yi, Y Lee - Proceedings of the 26th Annual International …, 2020 - dl.acm.org
We present Heimdall, a mobile GPU coordination platform for emerging Augmented Reality
(AR) applications. Future AR apps impose an unexplored challenging workload: i) concurrent …