Optimization techniques for GPU programming

P Hijma, S Heldens, A Sclocco… - ACM Computing …, 2023 - dl.acm.org
In the past decade, Graphics Processing Units have played an important role in the field of
high-performance computing and they still advance new fields such as IoT, autonomous …

Over 100x faster bootstrapping in fully homomorphic encryption through memory-centric optimization with GPUs

W Jung, S Kim, JH Ahn… - IACR Transactions on …, 2021 - tches.iacr.org
Fully homomorphic encryption (FHE) has been gaining in popularity as an emerging means
of enabling an unlimited number of operations in an encrypted message without decryption …

Parallel programming models for heterogeneous many-cores: a comprehensive survey

J Fang, C Huang, T Tang, Z Wang - CCF Transactions on High …, 2020 - Springer
Heterogeneous many-cores are now an integral part of modern computing systems ranging
from embedded systems to supercomputers. While heterogeneous many-core design offers …

Benchmarking 6dof outdoor visual localization in changing conditions

T Sattler, W Maddern, C Toft, A Torii… - Proceedings of the …, 2018 - openaccess.thecvf.com
Visual localization enables autonomous vehicles to navigate in their surroundings and
augmented reality applications to link virtual to real worlds. Practical visual localization …

Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU

VW Lee, C Kim, J Chhugani, M Deisher, D Kim… - Proceedings of the 37th …, 2010 - dl.acm.org
Recent advances in computing have led to an explosion in the amount of data being
generated. Processing the ever-growing data in a timely manner has made throughput …

Fifty years of artificial reverberation

V Valimaki, JD Parker, L Savioja… - IEEE Transactions on …, 2012 - ieeexplore.ieee.org
The first artificial reverberation algorithms were proposed in the early 1960s, and new,
improved algorithms are published regularly. These algorithms have been widely used in …

Ultra-fast FFT protein docking on graphics processors

DW Ritchie, V Venkatraman - Bioinformatics, 2010 - academic.oup.com
Motivation: Modelling protein–protein interactions (PPIs) is an increasingly important aspect
of structural bioinformatics. However, predicting PPIs using in silico docking techniques is …

Gzkp: A gpu accelerated zero-knowledge proof system

W Ma, Q Xiong, X Shi, X Ma, H Jin, H Kuang… - Proceedings of the 28th …, 2023 - dl.acm.org
Zero-knowledge proof (ZKP) is a cryptographic protocol that allows one party to prove the
correctness of a statement to another party without revealing any information beyond the …

State-of-the-art in heterogeneous computing

AR Brodtkorb, C Dyken, TR Hagen… - Scientific …, 2010 - content.iospress.com
Node level heterogeneous architectures have become attractive during the last decade for
several reasons: compared to traditional symmetric CPUs, they offer high peak performance …

Inter-block GPU communication via fast barrier synchronization

S Xiao, W Feng - … IEEE International Symposium on Parallel & …, 2010 - ieeexplore.ieee.org
While GPGPU stands for general-purpose computation on graphics processing units, the
lack of explicit support for inter-block communication on the GPU arguably hampers its …