GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without
decrypting it. FHE has garnered significant attention over the past decade as it supports …
CPElide: Efficient Multi-Chiplet GPU Implicit Synchronization
Chiplets are transforming computer system designs, allowing system designers to combine
heterogeneous computing resources at unprecedented scales. Breaking larger, monolithic …
Path Forward Beyond Simulators: Fast and Accurate GPU Execution Time Prediction for DNN Workloads
Today, DNNs' high computational complexity and sub-optimal device utilization present a
major roadblock to democratizing DNNs. To reduce the execution time and improve device …
Further Closing the GAP: Improving the Accuracy of gem5's GPU Models
V Ramadas, D Kouchekinia… - 6th Young Architects' …, 2024 - pages.cs.wisc.edu
The breakdown in Moore's Law and Dennard Scaling is leading to drastic changes in the
makeup and constitution of computing systems. For example, a single chip integrates 10 …
ShaderPerFormer: Platform-independent Context-aware Shader Performance Predictor
The ability to model and predict the execution time of GPU computations is crucial for real-
time graphics application development and optimization. While there are many existing …
T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives
Large Language Models increasingly rely on distributed techniques for their training and
inference. These techniques require communication across devices which can reduce …
[BOOK] High Performance Privacy Preserving AI
Artificial intelligence (AI) depends on data. In sensitive domains–such as healthcare,
security, finance, and many more–there is therefore a tension between unleashing the …
Enhanced System-Level Coherence for Heterogeneous Unified Memory Architectures
AM Nataraja, R Fernández-Pascual… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Heterogeneous Unified Memory Architectures (HUMA) provide a unified memory space for
on-die CPUs, GPUs, and other hardware accelerators. Such architectures improve …
A Design Methodology for Producing Highly-Adaptable and High-Performance Simulation Frameworks
Y Bao - 2024 - search.proquest.com
Computer architecture simulators play an essential role in the development and optimization
of computer hardware. A variety of simulators have been developed to explore the design …
GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption
YB Kaustubh Shivdikar… - MICRO'23, October 28 …, 2023 - digitum.um.es
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without
decrypting it. FHE has garnered significant attention over the past decade as it supports …