Applying lightweight soft error mitigation techniques to embedded mixed precision deep neural networks

G Abich, J Gava, R Garibotti, R Reis… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural networks (DNNs) are being incorporated in resource-constrained IoT devices,
which typically rely on reduced memory footprint and low-performance processors. While …

HAFT: Hardware-assisted fault tolerance

D Kuvaiskii, R Faqeh, P Bhatotia, P Felber… - Proceedings of the …, 2016 - dl.acm.org
Transient hardware faults during the execution of a program can cause data corruptions. We
present HAFT, a fault tolerance technique using hardware extensions of commodity CPUs to …

Lightweight checkpoint technique for resilience against soft errors

M Didehban, SRD Lokam, A Shrivastava - US Patent 10,997,027, 2021 - Google Patents
Systems and methods for implementing a lightweight checkpoint technique for resilience
against soft errors are disclosed. The technique provides effective, safe, and timely soft error …

Towards dynamic dependable systems through evidence-based continuous certification

R Faqeh, C Fetzer, H Hermanns, J Hoffmann… - … Applications of Formal …, 2020 - Springer
Future cyber-physical systems are expected to be dynamic, evolving while already being
deployed. Frequent updates of software components are likely to become the norm even for …

Dataflow model–based software synthesis framework for parallel and distributed embedded systems

E Jeong, D Jeong, S Ha - ACM Transactions on Design Automation of …, 2021 - dl.acm.org
Existing software development methodologies mostly assume that an application runs on a
single device without concern about the non-functional requirements of an embedded …

Asymmetric resilience: Exploiting task-level idempotency for transient error recovery in accelerator-based systems

J Leng, A Buyuktosunoglu, R Bertran… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Accelerators make the task of building systems that are resilient against transient errors like
voltage noise and soft errors hard. Architects integrate accelerators into the system as black …

NEMESIS: A software approach for computing in presence of soft errors

M Didehban, A Shrivastava… - 2017 IEEE/ACM …, 2017 - ieeexplore.ieee.org
Soft errors are considered as the main reliability challenge for sub-nanoscale
microprocessors. Software-level soft error resilience schemes are desirable because they …

Elzar: Triple modular redundancy using Intel AVX (practical experience report)

D Kuvaiskii, O Oleksenko, P Bhatotia… - 2016 46th Annual …, 2016 - ieeexplore.ieee.org
Instruction-Level Redundancy (ILR) is a well-known approach to tolerate transient CPU
faults. It replicates instructions in a program and inserts periodic checks to detect and correct …

SIMD-based soft error detection

Z Chen, A Nicolau, AV Veidenbaum - Proceedings of the ACM …, 2016 - dl.acm.org
Soft error rates in processors have been increasing with decreasing feature size and larger
chips. Software-only solutions have been proposed to deal with this problem, for instance …

Composition of component models - a key to construct big systems

W Reisig - International Symposium on Leveraging Applications of …, 2020 - Springer
Modern informatics based systems are mostly composed from self-contained components.
To be useful for really big systems, composed of many components, proper abstraction …