Hardware and Software Solutions for Energy‐Efficient Computing in Scientific Programming

D D'Agostino, I Merelli, M Aldinucci… - Scientific …, 2021 - Wiley Online Library
Energy consumption is one of the major issues in today's computer science, and an
increasing number of scientific communities are interested in evaluating the tradeoff …

Silent data errors: Sources, detection, and modeling

A Singh, S Chakravarty, G Papadimitriou… - 2023 IEEE 41st VLSI …, 2023 - ieeexplore.ieee.org
Chip manufacturers and hyperscalers are becoming increasingly aware of the problem
posed by Silent Data Errors (SDE) and are taking steps to address it. Major computing …

An experimental study of reduced-voltage operation in modern FPGAs for neural network acceleration

B Salami, EB Onural, IE Yuksel, F Koc… - 2020 50th Annual …, 2020 - ieeexplore.ieee.org
We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply
voltage below the nominal level, to improve the power-efficiency of Convolutional Neural …

Impact of voltage scaling on soft errors susceptibility of multicore server CPUs

D Agiakatsikas, G Papadimitriou, V Karakostas… - Proceedings of the 56th …, 2023 - dl.acm.org
Microprocessor power consumption and dependability are both crucial challenges that
designers have to cope with due to shrinking feature sizes and increasing transistor counts …

Silent data corruptions: The stealthy saboteurs of digital integrity

G Papadimitriou, D Gizopoulos… - 2023 IEEE 29th …, 2023 - ieeexplore.ieee.org
Silent Data Corruptions (SDCs) pose a significant threat to the integrity of digital systems.
These stealthy saboteurs silently corrupt data, remaining undetected by traditional error …

Enhancing Reliability in Embedded Systems Hardware: A Literature Survey

R Aalund, VP Paglioni - IEEE Access, 2025 - ieeexplore.ieee.org
Embedded Systems are used in extreme conditions, often for long lifespans; as such,
ensuring hardware reliability is essential. Additionally, the applications of embedded …

Understanding power consumption and reliability of high-bandwidth memory with voltage underscaling

SSN Larimi, B Salami, OS Unsal… - … , Automation & Test …, 2021 - ieeexplore.ieee.org
Modern computing devices employ High-Bandwidth Memory (HBM) to meet their memory
bandwidth requirements. An HBM-enabled device consists of multiple DRAM layers stacked …

Estimating the failures and silent errors rates of CPUs across ISAs and microarchitectures

D Gizopoulos, G Papadimitriou… - … IEEE International Test …, 2023 - ieeexplore.ieee.org
Silent data corruptions (SDCs) pose a significant challenge to the reliable operation of
modern microprocessors. As the need for enhanced performance and reliability continues to …

BAFT: bubble-aware fault-tolerant framework for distributed DNN training with hybrid parallelism

R Chen, G Lu, Y Wang, R Zhang, Z Hu, Y Miao… - Frontiers of Computer …, 2025 - Springer
As deep neural networks (DNNs) have been successfully adopted in various domains, the
training of these large-scale models becomes increasingly difficult and is often deployed on …

Silent data corruptions in computing systems: Early predictions and large-scale measurements

D Gizopoulos, G Papadimitriou… - 2024 IEEE European …, 2024 - ieeexplore.ieee.org
Silent Data Corruptions (SDCs) due to defects in computing chips (CPUs, GPUs, AI
accelerators) is a critical threat to the quality of large-scale computing in different application …