DAGuE: A generic distributed DAG engine for high performance computing
The frenetic development of the current architectures places a strain on the current state-of-
the-art programming environments. Harnessing the full potential of such architectures is a …
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
The emergence and continuing use of multi-core architectures and graphics processing
units require changes in the existing software and sometimes even a redesign of the …
[BOOK][B] Communication-avoiding Krylov subspace methods
M Hoemmen - 2010 - search.proquest.com
Krylov subspace methods (KSMs) are iterative algorithms for solving large, sparse linear
systems and eigenvalue problems. Current KSMs rely on sparse matrix-vector multiply …
An augmented Lagrangian based algorithm for distributed nonconvex optimization
This paper is about distributed derivative-based algorithms for solving optimization problems
with a separable (potentially nonconvex) objective function and coupled affine constraints. A …
TRACON: Interference-aware scheduling for data-intensive applications in virtualized environments
Large-scale data centers leverage virtualization technology to achieve excellent resource
utilization, scalability, and high availability. Ideally, the performance of an application …
Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA
We present a method for developing dense linear algebra algorithms that seamlessly scales
to thousands of cores. It can be done with our project called DPLASMA (Distributed …
Real-time big data stream processing using GPU with spark over hadoop ecosystem
In this technological era, every person, authorities, entrepreneurs, businesses, and many
things around us are connected to the internet, forming the Internet of Things (IoT). This generates …
Task superscalar: An out-of-order task pipeline
We present Task Superscalar, an abstraction of instruction-level out-of-order
pipeline that operates at the task-level. Like ILP pipelines, which uncover parallelism in a …
A real-time fall detection system using a wearable gait analysis sensor and a Support Vector Machine (SVM) classifier
In this study, we report a custom designed wireless gait analysis sensor (WGAS) system for
real-time fall detection using a Support Vector Machine (SVM) classifier. Our WGAS includes …
Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous
multicore and multi-GPU systems to support dense matrix computations efficiently. The main …