Fault-Tolerant Message-Passing Distributed Systems

M Raynal - An Algorithmic Approach, 2018 - Springer
Michel Raynal. Fault-Tolerant Message-Passing Distributed Systems: An Algorithmic Approach …

Failure-aware resource management for high-availability computing clusters with distributed virtual machines

S Fu - Journal of Parallel and Distributed Computing, 2010 - Elsevier
In large-scale networked computing systems, component failures become norms instead of
exceptions. Failure-aware resource management is crucial for enhancing system availability …

Fault-tolerant leader election in mobile dynamic distributed systems

C Gómez-Calzado, A Lafuente… - 2013 IEEE 19th …, 2013 - ieeexplore.ieee.org
This paper addresses the leader election problem in dynamic distributed systems with
mobile processes. To do so, it is assumed that the system alternates periods of good and …

On implementing omega in systems with weak reliability and synchrony assumptions

MK Aguilera, C Delporte-Gallet, H Fauconnier… - Distributed …, 2008 - Springer
We study the feasibility and cost of implementing Ω—a fundamental failure detector at the
core of many algorithms—in systems with weak reliability and synchrony assumptions …

Eventual leader election in evolving mobile networks

L Arantes, F Greve, P Sens, V Simon - Principles of Distributed Systems …, 2013 - Springer
Many reliable distributed services rely on an eventual leader election to coordinate actions.
The eventual leader detector has been proposed as a way to implement such an …

Memory-intensive benchmarks: IRAM vs. cache-based machines

BR Gaeke, P Husbands, XS Li, L Oliker… - Proceedings 16th …, 2002 - ieeexplore.ieee.org
The increasing gap between processor and memory performance has led to new
architectural models for memory-intensive applications. In this paper, we use a set of …

Failure detectors in homonymous distributed systems (with an application to consensus)

S Arévalo, AF Anta, D Imbs, E Jiménez… - Journal of Parallel and …, 2015 - Elsevier
This paper is on homonymous distributed systems where processes are prone to crash
failures and have no initial knowledge of the system membership (“homonymous” means …

Implementing the omega failure detector in the crash-recovery failure model

C Martín, M Larrea, E Jiménez - Journal of Computer and System Sciences, 2009 - Elsevier
Unreliable failure detectors are mechanisms providing information about process failures,
that allow solving several problems in asynchronous systems, e.g., Consensus. A particular …

Eventually strong failure detector with unknown membership

F Greve, P Sens, L Arantes, V Simon - The Computer Journal, 2012 - academic.oup.com
The distributed computing scenario is rapidly evolving for integrating self-organizing and
dynamic wireless networks. Unreliable failure detectors (FDs) are classical mechanisms that …

Never Say Never--Probabilistic and Temporal Failure Detectors

D Dzung, R Guerraoui, D Kozhaya… - 2016 IEEE …, 2016 - ieeexplore.ieee.org
The failure detector approach for solving distributed computing problems has been
celebrated for its modularity. This approach allows the construction of algorithms using …