DeepVenom: Persistent DNN backdoors exploiting transient weight perturbations in memories

K Cai, MHI Chowdhuryy, Z Zhang… - 2024 IEEE Symposium …, 2024 - ieeexplore.ieee.org
Backdoor attacks have raised significant concerns in machine learning (ML) systems.
Mainstream ML backdoor attacks typically involve either poisoning the victim's training …

vAttention: Dynamic memory management for serving LLMs without PagedAttention

R Prabhu, A Nayak, J Mohan, R Ramjee… - arXiv preprint arXiv …, 2024 - arxiv.org
Efficient management of GPU memory is essential for high throughput LLM inference. Prior
systems used to reserve KV-cache memory ahead-of-time that resulted in wasted capacity …

Invalidate+Compare: A Timer-Free GPU Cache Attack Primitive

Z Zhang, K Cai, Y Guo, F Yao, X Gao - 33rd USENIX Security …, 2024 - usenix.org
While extensive research has been conducted on CPU cache side-channel attacks, the
landscape of similar studies on modern GPUs remains largely uncharted. In this paper, we …

GPU Memory Exploitation for Fun and Profit

Y Guo, Z Zhang, J Yang - 33rd USENIX Security Symposium (USENIX …, 2024 - usenix.org
As modern applications increasingly rely on GPUs to accelerate the computation, it has
become very critical to study and understand the security implications of GPUs. In this work …

Uncovering Real GPU NoC Characteristics: Implications on Interconnect Architecture

Z Jin, C Rocca, J Kim, H Kasan, M Rhu… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
A critical component of high-throughput processors such as GPUs is the network-on-chip
(NoC) that interconnects the large number of cores and the memory partitions together. In …

Confidential Computing on Heterogeneous CPU-GPU Systems: Survey and Future Directions

Q Wang, D Oswald - arXiv preprint arXiv:2408.11601, 2024 - arxiv.org
In recent years, the widespread informatization and rapid data explosion have increased the
demand for high-performance heterogeneous systems that integrate multiple computing …

XpuTEE: A High-Performance and Practical Heterogeneous Trusted Execution Environment for GPUs

S Fan, Z Hua, Y Xia, H Chen - ACM Transactions on Computer Systems, 2025 - dl.acm.org
AI applications are employed in diverse scenarios, including data centers, personal
computers, smart cars, and so on. Their privacy is threatened by the intricate software stacks …

Improving multi-instance GPU efficiency via sub-entry sharing TLB design

B Li, Y Wang, T Wang, L Eeckhout, J Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
NVIDIA's Multi-Instance GPU (MIG) technology enables partitioning GPU computing power
and memory into separate hardware instances, providing complete isolation including …

Guardian: Safe GPU Sharing in Multi-Tenant Environments

M Pavlidakis, G Vasiliadis, S Mavridis… - Proceedings of the 25th …, 2024 - dl.acm.org
Modern GPU applications, such as machine learning (ML), can only partially utilize GPUs,
leading to GPU underutilization in cloud environments. Sharing GPUs across multiple …

Veiled Pathways: Investigating Covert and Side Channels Within GPU Uncore

Y Miao, Y Zhang, D Wu, D Zhang, G Tan… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
With the emergence of GPUs as first-class compute engines, more concentrated focus has
been put into covert and side channel discovery in these architectures. However, most of the …