Graviton: Trusted execution environments on GPUs
We propose Graviton, an architecture for supporting trusted execution environments on
GPUs. Graviton enables applications to offload security- and performance-sensitive kernels …
Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks
Deep Neural Networks (DNNs) have reinvigorated real-world applications that rely on
learning patterns of data and are permeating into different industries and markets. Cloud …
Prophet: Precise QoS prediction on non-preemptive accelerators to improve utilization in warehouse-scale computers
Guaranteeing Quality-of-Service (QoS) of latency-sensitive applications while improving
server utilization through application co-location is important yet challenging in modern …
Heimdall: mobile GPU coordination platform for augmented reality applications
We present Heimdall, a mobile GPU coordination platform for emerging Augmented Reality
(AR) applications. Future AR apps impose an explored challenging workload: i) concurrent …
Warped-slicer: Efficient intra-SM slicing through dynamic resource partitioning for GPU multiprogramming
As technology scales, GPUs are forecasted to incorporate an ever-increasing amount of
computing resources to support thread-level parallelism. But even with the best effort …
MASK: Redesigning the GPU memory hierarchy to support multi-application concurrency
Graphics Processing Units (GPUs) exploit large amounts of thread-level parallelism to
provide high instruction throughput and to efficiently hide long-latency stalls. The resulting …
EffiSha: A software framework for enabling efficient preemptive scheduling of GPU
Modern GPUs are broadly adopted in many multitasking environments, including data
centers and smartphones. However, the current support for the scheduling of multiple GPU …
Dynamic resource management for efficient utilization of multitasking GPUs
As graphics processing units (GPUs) are broadly adopted, running multiple applications on
a GPU at the same time is beginning to attract wide attention. Recent proposals on …
G-NET: Effective GPU Sharing in NFV Systems
Network Function Virtualization (NFV) virtualizes software network functions to offer flexibility
in their design, management and deployment. Although GPUs have demonstrated their …
Zorua: A holistic approach to resource virtualization in GPUs
This paper introduces a new resource virtualization framework, Zorua, that decouples the
programmer-specified resource usage of a GPU application from the actual allocation in the …