A comparison of application-level fault tolerance schemes for task pools
J Posner, L Reitz, C Fohry - Future Generation Computer Systems, 2020 - Elsevier
Fault tolerance is an important requirement for successful program execution on exascale
systems. The common approach, checkpointing, regularly saves a program's state, such that …
A machine learning-based resource-efficient task scheduler for heterogeneous computer systems
Heterogeneous computer systems are becoming mainstream due to their disparate
processing and performance capabilities. These systems consist of different types of …
A Java task pool framework providing fault-tolerant global load balancing
J Posner, C Fohry - International Journal of Networking and …, 2018 - jstage.jst.go.jp
Fault tolerance is gaining importance in parallel computing, especially on large clusters.
Traditional approaches handle the issue on system-level. Application-level approaches are …
Fault tolerance for lifeline-based global load balancing
C Fohry, M Bungart, P Plock - Journal of Software Engineering and …, 2017 - scirp.org
Fault tolerance has become an important issue in parallel computing. It is often addressed at
system level, but application-level approaches receive increasing attention. We consider a …
Towards an efficient fault-tolerance scheme for GLB
C Fohry, M Bungart, J Posner - … of the ACM SIGPLAN Workshop on X10, 2015 - dl.acm.org
X10's Global Load Balancing framework GLB implements a user-level task pool for inter-
place load balancing. It is based on work stealing and deploys the lifeline algorithm. A single …
An analysis of the structural, dynamic, and temporal aspects of semantic data models
SD Urban, LML Delcambre - 1986 IEEE Second International …, 1986 - ieeexplore.ieee.org
Semantic data models have been influenced by abstraction techniques used in knowledge
representation. Early semantic models concentrated on the structural aspects of an …
Fault tolerance schemes for global load balancing in X10
C Fohry, M Bungart, J Posner - Scalable Computing: Practice and …, 2015 - scpe.org
Scalability postulates fault tolerance to be efficient. One approach handles permanent node
failures at user level. It is supported by Resilient X10, a Partitioned Global Address Space …
A selective and incremental backup scheme for task pools
C Fohry, J Posner, L Reitz - 2018 International Conference on …, 2018 - ieeexplore.ieee.org
Checkpointing is a common approach to prevent loss of a program's state after permanent
node failures. When it is performed on application-level, less data need to be saved. This …
Fault Tolerance for Cooperative Lifeline-Based Global Load Balancing in Java with APGAS and Hazelcast
J Posner, C Fohry - 2017 IEEE International Parallel and …, 2017 - ieeexplore.ieee.org
Fault tolerance is a major issue for parallel applications. Approaches on application-level
are gaining increasing attention because they may be more efficient than system-level ones …
A robust fault tolerance scheme for lifeline-based taskpools
C Fohry, M Bungart - 2016 45th International Conference on …, 2016 - ieeexplore.ieee.org
Fault tolerance is of increasing importance for parallel computing. While often addressed at
system level, application-level resilience techniques may be more efficient. In particular, it …