[HTML] A taxonomy of task-based parallel programming technologies for high-performance computing
Task-based programming models for shared memory—such as Cilk Plus and OpenMP 3—
are well established and documented. However, with the increase in parallel, many-core …
Resiliency in numerical algorithm design for extreme scale simulations
This work is based on the seminar titled 'Resiliency in Numerical Algorithm Design for
Extreme Scale Simulations' held March 1–6, 2020, at Schloss Dagstuhl, that was attended …
Studying error propagation on application data structure and hardware
As technology scales, transistors become smaller and aggressive power optimization
techniques combined with high operation frequencies and performance-enhancing …
Exploiting asynchrony from exact forward recovery for DUE in iterative solvers
This paper presents a method to protect iterative solvers from Detected and Uncorrected
Errors (DUE) relying on error detection techniques already available in commodity …
Checkpointing workflows for fail-stop errors
We consider the problem of orchestrating the execution of workflow applications structured
as Directed Acyclic Graphs (DAGs) on parallel computing platforms that are subject to fail …
Unified fault-tolerance framework for hybrid task-parallel message-passing applications
We present a unified fault-tolerance framework for task-parallel message-passing
applications to mitigate transient errors. First, we propose a fault-tolerant message-logging …
Spatial support vector regression to detect silent errors in the exascale era
As the exascale era approaches, the increasing capacity of high-performance computing
(HPC) systems with targeted power and energy budget goals introduces significant …
ViTS: video tagging system from massive web multimedia collections
D Fernández, D Varas, J Espadaler… - Proceedings of the …, 2017 - openaccess.thecvf.com
The popularization of multimedia content on the Web has given rise to the need to automatically
understand, index and retrieve it. In this paper we present ViTS, an automatic Video Tagging …
Exploring the capabilities of support vector machines in detecting silent data corruptions
As the exascale era approaches, the increasing capacity of high-performance computing
(HPC) systems with targeted power and energy budget goals introduces significant …
Enabling resilience in asynchronous many-task programming models
Resilience is an imminent issue for next-generation platforms due to projected increases in
soft/transient failures as part of the inherent trade-offs among performance, energy, and …