GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption

K Shivdikar, Y Bao, R Agrawal, M Shen… - Proceedings of the 56th …, 2023 - dl.acm.org
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without
decrypting it. FHE has garnered significant attention over the past decade as it supports …

CPElide: Efficient Multi-Chiplet GPU Implicit Synchronization

P Dalmia, RS Kumar, MD Sinclair - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Chiplets are transforming computer system designs, allowing system designers to combine
heterogeneous computing resources at unprecedented scales. Breaking larger, monolithic …

Path Forward Beyond Simulators: Fast and Accurate GPU Execution Time Prediction for DNN Workloads

Y Li, Y Sun, A Jog - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Today, DNNs' high computational complexity and sub-optimal device utilization present a
major roadblock to democratizing DNNs. To reduce the execution time and improve device …

Further Closing the GAP: Improving the Accuracy of gem5's GPU Models

V Ramadas, D Kouchekinia… - 6th Young Architects' …, 2024 - pages.cs.wisc.edu
The breakdown in Moore's Law and Dennard Scaling is leading to drastic changes in the
makeup and constitution of computing systems. For example, a single chip integrates 10 …

ShaderPerFormer: Platform-independent Context-aware Shader Performance Predictor

Z Liu, Y Huang, L Liu - Proceedings of the ACM on Computer Graphics …, 2024 - dl.acm.org
The ability to model and predict the execution time of GPU computations is crucial for
real-time graphics application development and optimization. While there are many existing …

T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives

S Pati, S Aga, M Islam, N Jayasena… - Proceedings of the 29th …, 2024 - dl.acm.org
Large Language Models increasingly rely on distributed techniques for their training and
inference. These techniques require communication across devices which can reduce …

High Performance Privacy Preserving AI

J Shenoy, P Grinaway, S Palakodety - 2024 - nowpublishers.com
Artificial intelligence (AI) depends on data. In sensitive domains–such as healthcare,
security, finance, and many more–there is therefore a tension between unleashing the …

Enhanced System-Level Coherence for Heterogeneous Unified Memory Architectures

AM Nataraja, R Fernández-Pascual… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Heterogeneous Unified Memory Architectures (HUMA) provide a unified memory space for
on-die CPUs, GPUs, and other hardware accelerators. Such architectures improve …

A Design Methodology for Producing Highly-Adaptable and High-Performance Simulation Frameworks

Y Bao - 2024 - search.proquest.com
Computer architecture simulators play an essential role in the development and optimization
of computer hardware. A variety of simulators have been developed to explore the design …

GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption

YB Kaustubh Shivdikar… - MICRO'23, October 28 …, 2023 - digitum.um.es
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without
decrypting it. FHE has garnered significant attention over the past decade as it supports …