Software fault tolerance in real-time systems: Identifying the future research questions

F Reghenzani, Z Guo, W Fornaciari - ACM Computing Surveys, 2023 - dl.acm.org
Tolerating hardware faults in modern architectures is becoming a prominent problem due to
the miniaturization of the hardware components, their increasing complexity, and the …

A survey of fault-tolerance techniques for embedded systems from the perspective of power, energy, and thermal issues

S Safari, M Ansari, H Khdr, P Gohari-Nazari… - IEEE …, 2022 - ieeexplore.ieee.org
The relentless technology scaling has provided a significant increase in processor
performance, but on the other hand, it has led to adverse impacts on system reliability. In …

The interplay of power management and fault recovery in real-time systems

R Melhem, D Mosse, E Elnozahy - IEEE Transactions on …, 2004 - ieeexplore.ieee.org
We describe how to exploit the scheduling slack in a real-time system to reduce energy
consumption and achieve fault tolerance at the same time. During failure-free operation, a …

Design optimization of time- and cost-constrained fault-tolerant embedded systems with checkpointing and replication

P Pop, V Izosimov, P Eles… - IEEE Transactions on Very …, 2009 - ieeexplore.ieee.org
We present an approach to the synthesis of fault-tolerant hard real-time systems for
safety-critical applications. We use checkpointing with rollback recovery and active replication for …

A unified approach for fault tolerance and dynamic power management in fixed-priority real-time embedded systems

Y Zhang, K Chakrabarty - IEEE Transactions on Computer …, 2005 - ieeexplore.ieee.org
This paper investigates an integrated approach for achieving fault tolerance and energy
savings in real-time embedded systems. Fault tolerance is achieved via checkpointing, and …

Cyber-physical system checkpointing and recovery

F Kong, M Xu, J Weimer, O Sokolsky… - 2018 ACM/IEEE 9th …, 2018 - ieeexplore.ieee.org
Transitioning to more open architectures has been making Cyber-Physical Systems (CPS)
vulnerable to malicious attacks that are beyond the conventional cyber attacks. This paper …

Two-state checkpointing for energy-efficient fault tolerance in hard real-time systems

M Salehi, MK Tavana, S Rehman… - … Transactions on Very …, 2016 - ieeexplore.ieee.org
Checkpointing with rollback recovery is a well-established technique to tolerate transient
faults. However, it incurs significant time and energy overheads, which go wasted in fault …

Robust mixed-criticality systems

A Burns, RI Davis, S Baruah… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Certification authorities require correctness and survivability. In the temporal domain this
requires a convincing argument that all deadlines will be met under error free conditions …

LEC-MiCs: Low-Energy Checkpointing in Mixed-Criticality Multicore Systems

S Safari, S Shokri, S Hessabi… - ACM Transactions on …, 2025 - dl.acm.org
With the advent of multicore platforms in designing Mixed-Criticality Systems (MCSs),
simultaneous management of reliability and energy while guaranteeing an acceptable …

Fault-tolerant and real-time scheduling for mixed-criticality systems

RM Pathan - Real-Time Systems, 2014 - Springer
The design and analysis of real-time scheduling algorithms for safety-critical systems is a
challenging problem due to the temporal dependencies among different design constraints …