DeepVenom: Persistent DNN backdoors exploiting transient weight perturbations in memories
Backdoor attacks have raised significant concerns in machine learning (ML) systems.
Mainstream ML backdoor attacks typically involve either poisoning the victim's training …
vAttention: Dynamic memory management for serving LLMs without PagedAttention
Efficient management of GPU memory is essential for high throughput LLM inference. Prior
systems used to reserve KV-cache memory ahead-of-time that resulted in wasted capacity …
Invalidate+Compare: A Timer-Free GPU Cache Attack Primitive
While extensive research has been conducted on CPU cache side-channel attacks, the
landscape of similar studies on modern GPUs remains largely uncharted. In this paper, we …
GPU Memory Exploitation for Fun and Profit
As modern applications increasingly rely on GPUs to accelerate the computation, it has
become very critical to study and understand the security implications of GPUs. In this work …
Uncovering Real GPU NoC Characteristics: Implications on Interconnect Architecture
A critical component of high-throughput processors such as GPUs is the network-on-chip
(NoC) that interconnects the large number of cores and the memory partitions together. In …
Confidential Computing on Heterogeneous CPU-GPU Systems: Survey and Future Directions
Q Wang, D Oswald - arXiv preprint arXiv:2408.11601, 2024 - arxiv.org
In recent years, the widespread informatization and rapid data explosion have increased the
demand for high-performance heterogeneous systems that integrate multiple computing …
XpuTEE: A High-Performance and Practical Heterogeneous Trusted Execution Environment for GPUs
AI applications are employed in diverse scenarios, including data centers, personal
computers, smart cars, and so on. Their privacy is threatened by the intricate software stacks …
Improving multi-instance GPU efficiency via sub-entry sharing TLB design
NVIDIA's Multi-Instance GPU (MIG) technology enables partitioning GPU computing power
and memory into separate hardware instances, providing complete isolation including …
Guardian: Safe GPU Sharing in Multi-Tenant Environments
M Pavlidakis, G Vasiliadis, S Mavridis… - Proceedings of the 25th …, 2024 - dl.acm.org
Modern GPU applications, such as machine learning (ML), can only partially utilize GPUs,
leading to GPU underutilization in cloud environments. Sharing GPUs across multiple …
Veiled Pathways: Investigating Covert and Side Channels Within GPU Uncore
With the emergence of GPUs as first-class compute engines, more concentrated focus has
been put into covert and side channel discovery in these architectures. However, most of the …