MISO: exploiting multi-instance GPU capability on multi-tenant GPU clusters

B Li, T Patel, S Samsi, V Gadepally… - Proceedings of the 13th …, 2022 - dl.acm.org
GPU technology has been improving at an expedited pace in terms of size and performance,
empowering HPC and AI/ML researchers to advance the scientific discovery process …

Hardware compute partitioning on NVIDIA GPUs

J Bakita, JH Anderson - 2023 IEEE 29th Real-Time and …, 2023 - ieeexplore.ieee.org
Embedded and autonomous systems are increasingly integrating AI/ML features, often
enabled by a hardware accelerator such as a GPU. As these workloads become …

Making powerful enemies on NVIDIA GPUs

T Yandrofski, J Chen, N Otterness… - 2022 IEEE Real …, 2022 - ieeexplore.ieee.org
Graphics Processing Units (GPUs) are widely used in safety-critical real-time systems such
as autonomous vehicles due to their high performance on artificial intelligence (AI) work …

Towards Efficient Parallel GPU Scheduling: Interference Awareness with Schedule Abstraction

N Feddal, G Lipari, HE Zahaf - … of the 32nd International Conference on …, 2024 - dl.acm.org
GPUs are powerful computing architectures that are increasingly used in embedded
systems for implementing complex intelligent applications. Unfortunately, it is difficult to …

Optimizing GPU Multiplexing for Efficient and Cost-Effective Access to Diverse Large Language Models in GPU Clusters

Y Zhu, C Wang, M Calman… - … on Modeling, Analysis …, 2024 - ieeexplore.ieee.org
Large Language Models (LLMs) are a cornerstone of modern artificial intelligence research,
gaining popularity and encouraging adoption in varying domains. The burgeoning interest …

Memory interference and performance prediction in GPU-accelerated heterogeneous systems

A Masola - 2024 - repository.unipr.it
Nowadays, a variety of applications, including automated factories, autonomous vehicles, and
Cyber-Physical Systems (CPS), are experiencing significant growth. Given the various challenges …

Selecting Preemption Points for Single Core Energy-Neutral Real-Time Systems

HE Zahaf, PE Hladik, S Faucou, A Queudet - Available at SSRN 4953960 - papers.ssrn.com
In this work, we focus on energy-neutral real-time systems, where ambient energy harvested
in the environment is used to power a device that executes tasks with timing constraints. We …

AI-based Scalable Analytics for Improving Performance and Resilience of HPC Systems

E Sencan, B Oztop, B Schwaller, VJ Leung, J Brandt… - sc24.supercomputing.org
As High-Performance Computing (HPC) advances to exascale levels, its role in scientific
fields such as medicine, climate research, finance, and scientific computing becomes …