TensorFHE: Achieving practical computation on encrypted data using GPGPU

S Fan, Z Wang, W Xu, R Hou, D Meng… - … Symposium on High …, 2023 - ieeexplore.ieee.org
In the cloud computing era, privacy protection is becoming pervasive in a broad range of
applications (e.g., machine learning, data mining, etc.). Fully Homomorphic Encryption (FHE) …

Accelerating polynomial multiplication for homomorphic encryption on GPUs

K Shivdikar, G Jonatan, E Mora… - … on Secure and …, 2022 - ieeexplore.ieee.org
Homomorphic Encryption (HE) enables users to securely outsource both the storage and
computation of sensitive data to untrusted servers. Not only does HE offer an attractive …

DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication

Y Lu, W Liu - Proceedings of the International Conference for High …, 2023 - dl.acm.org
Sparse matrix-vector multiplication (SpMV) plays a key role in computational science and
engineering, graph processing, and machine learning applications. Much work on SpMV …

Cheddar: A swift fully homomorphic encryption library for CUDA GPUs

J Kim, W Choi, JH Ahn - arXiv preprint arXiv:2407.13055, 2024 - arxiv.org
Fully homomorphic encryption (FHE) is a cryptographic technology capable of resolving
security and privacy problems in cloud computing by encrypting data in use. However, FHE …

Bind the gap: Compiling real software to hardware FFT accelerators

J Woodruff, J Armengol-Estapé, S Ainsworth… - Proceedings of the 43rd …, 2022 - dl.acm.org
Specialized hardware accelerators continue to be a source of performance improvement.
However, such specialization comes at a programming price. The fundamental issue is that …

SIMD²: A generalized matrix instruction set for accelerating tensor computation beyond GEMM

Y Zhang, PA Tsai, HW Tseng - Proceedings of the 49th Annual …, 2022 - dl.acm.org
Matrix-multiplication units (MXUs) are now prevalent in every computing platform. The key
attribute that makes MXUs so successful is the semiring structure, which allows tiling for both …

Collaborative Acceleration for FFT on Commercial Processing-In-Memory Architectures

MA Ibrahim, S Aga - arXiv preprint arXiv:2308.03973, 2023 - arxiv.org
This paper evaluates the efficacy of recent commercial processing-in-memory (PIM)
solutions to accelerate fast Fourier transform (FFT), an important primitive across several …

HEGrid: A high efficient multi-channel radio astronomical data gridding framework in heterogeneous computing environments

H Wang, C Yu, J Xiao, S Tang, M Long… - Future Generation …, 2023 - Elsevier
The challenge to fully exploit the potential of existing and upcoming scientific instruments
like large single-dish radio telescopes is to process the collected massive data effectively …

MAD MAcce: Supporting Multiply-Add Operations for Democratizing Matrix-Multiplication Accelerators

S Sung, S Hur, S Kim, D Ha, Y Oh, WW Ro - Proceedings of the 56th …, 2023 - dl.acm.org
Modern GPUs commonly employ specialized matrix multiplication units (MXUs) to
accelerate matrix multiplication, the core computation of deep learning workloads. However …

A survey of software implementations for the number theoretic transform

AC Mert, F Yaman, E Karabulut, E Öztürk… - … on Embedded Computer …, 2023 - Springer
This survey summarizes the software implementation knowledge of the Number Theoretic
Transform (NTT)—a major subroutine of lattice-based cryptosystems. The NTT is a special …