GPU devices for safety-critical systems: A survey

J Perez-Cerrolaza, J Abella, L Kosmidis… - ACM Computing …, 2022 - dl.acm.org
Graphics Processing Unit (GPU) devices and their associated software programming
languages and frameworks can deliver the computing performance required to facilitate the …

Revealing GPUs vulnerabilities by combining register-transfer and software-level fault injection

FF dos Santos, JER Condia, L Carro… - 2021 51st Annual …, 2021 - ieeexplore.ieee.org
The complexity of both hardware and software makes GPUs reliability evaluation extremely
challenging. A low level fault injection on a GPU model, despite being accurate, would take …

Analyzing DUE errors on GPUs with neutron irradiation test and fault injection to control flow

K Ito, Y Zhang, H Itsuji, T Uezono… - … on Nuclear Science, 2021 - ieeexplore.ieee.org
As GPU applications expand, the reliability of GPU is drawing more attention since even
reliability-demanding applications are executed on GPUs. Silent data corruption (SDC) is …

X Wei, N Jiang, H Yue, X Wang, J Zhao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Nowadays, selective instruction duplication (SelDup) is the typical approach to detect silent
data corruption (SDC) in GPGPU. However, owing to the up-to-billions fault sites of parallel …

Characterizing and exploiting soft error vulnerability phase behavior in GPU applications

F Previlon, C Kalra, D Tiwari… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
System reliability has become a first-class design constraint. As the use of Graphics
Processing Units (GPU) continues to increase in compute applications, including High …

G-SEAP: Analyzing and characterizing soft-error aware approximation in GPGPUs

X Wei, H Yue, S Gao, L Li, R Zhang, J Tan - Future Generation Computer …, 2020 - Elsevier
As General-Purpose Graphics Processing Units (GPGPUs) become pervasive for
the High-Performance Computing (HPC), ensuring that programs can be protected from soft …

A comprehensive evaluation of the effects of input data on the resilience of GPU applications

FG Previlon, C Kalra, DR Kaeli… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
While GPUs are being aggressively deployed in a growing number of computing domains,
their resilience to transient faults remains a subject of concern. To gain a better …

LAD-ECC: Energy-efficient ECC mechanism for GPGPUs register file

X Wei, H Yue, J Tan - 2020 Design, Automation & Test in …, 2020 - ieeexplore.ieee.org
Graphics Processing Units (GPUs) are widely used in general-purpose high-performance
computing applications (i.e., GPGPUs), which require reliable execution in the presence of …

Evaluating the soft error resilience of instructions for GPU applications

X Wei, R Zhang, Y Liu, H Yue… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Graphics Processing Units (GPUs) are widely used in a range of High Performance
Computing fields because of high parallelism. As the technology scaling down, GPUs are …