GPU scheduling on the NVIDIA TX2: Hidden details revealed

T Amert, N Otterness, M Yang… - 2017 IEEE Real …, 2017 - ieeexplore.ieee.org
The push towards fielding autonomous-driving capabilities in vehicles is happening at
breakneck speed. Semi-autonomous features are becoming increasingly common, and fully …

Enabling preemptive multiprogramming on GPUs

I Tanasic, I Gelado, J Cabezas, A Ramirez… - ACM SIGARCH …, 2014 - dl.acm.org
GPUs are being increasingly adopted as compute accelerators in many domains, spanning
environments from mobile systems to cloud computing. These systems are usually running …

Prophet: Precise qos prediction on non-preemptive accelerators to improve utilization in warehouse-scale computers

Q Chen, H Yang, M Guo, RS Kannan, J Mars… - Proceedings of the …, 2017 - dl.acm.org
Guaranteeing Quality-of-Service (QoS) of latency-sensitive applications while improving
server utilization through application co-location is important yet challenging in modern …

An evaluation of the NVIDIA TX1 for supporting real-time computer-vision workloads

N Otterness, M Yang, S Rust, E Park… - 2017 IEEE Real …, 2017 - ieeexplore.ieee.org
Autonomous vehicles are an exemplar for forward-looking safety-critical real-time systems
where significant computing capacity must be provided within strict size, weight, and power …

Paella: Low-latency model serving with software-defined gpu scheduling

KKW Ng, HM Demoulin, V Liu - Proceedings of the 29th Symposium on …, 2023 - dl.acm.org
Model serving systems play a critical role in multiplexing machine learning inference jobs
across shared GPU infrastructure. These systems have traditionally sat at a high level of …

A survey on techniques for cooperative CPU-GPU computing

K Raju, NN Chiplunkar - Sustainable Computing: Informatics and Systems, 2018 - Elsevier
Graphical Processing Unit provides massive parallelism due to the presence of
hundreds of cores. Usage of GPUs for general purpose computation (GPGPU) has resulted …

Warped-slicer: Efficient intra-SM slicing through dynamic resource partitioning for GPU multiprogramming

Q Xu, H Jeon, K Kim, WW Ro… - ACM SIGARCH Computer …, 2016 - dl.acm.org
As technology scales, GPUs are forecasted to incorporate an ever-increasing amount of
computing resources to support thread-level parallelism. But even with the best effort …

Deadline-based scheduling for GPU with preemption support

N Capodieci, R Cavicchioli, M Bertogna… - 2018 IEEE Real …, 2018 - ieeexplore.ieee.org
Modern automotive-grade embedded computing platforms feature high-performance
Graphics Processing Units (GPUs) to support the massively parallel processing power …

Avoiding pitfalls when using NVIDIA GPUs for real-time tasks in autonomous systems

M Yang - Proceedings of the 30th Euromicro Conference on …, 2018 - par.nsf.gov
A fundamental shift is reshaping how real-time analysis is applied in all forms of
autonomous systems (e.g., UAVs, robotics, and, especially, self-driving automobiles) …

Effisha: A software framework for enabling efficient preemptive scheduling of gpu

G Chen, Y Zhao, X Shen, H Zhou - … on Principles and Practice of Parallel …, 2017 - dl.acm.org
Modern GPUs are broadly adopted in many multitasking environments, including data
centers and smartphones. However, the current support for the scheduling of multiple GPU …