DAGuE: A generic distributed DAG engine for high performance computing

G Bosilca, A Bouteiller, A Danalis, T Herault… - Parallel Computing, 2012 - Elsevier
The frenetic development of the current architectures places a strain on the current state-of-
the-art programming environments. Harnessing the full potential of such architectures is a …

Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects

E Agullo, J Demmel, J Dongarra, B Hadri… - Journal of Physics …, 2009 - iopscience.iop.org
The emergence and continuing use of multi-core architectures and graphics processing
units require changes in the existing software and sometimes even a redesign of the …

[BOOK][B] Communication-avoiding Krylov subspace methods

M Hoemmen - 2010 - search.proquest.com
Krylov subspace methods (KSMs) are iterative algorithms for solving large, sparse linear
systems and eigenvalue problems. Current KSMs rely on sparse matrix-vector multiply …

An augmented Lagrangian based algorithm for distributed nonconvex optimization

B Houska, J Frasch, M Diehl - SIAM Journal on Optimization, 2016 - SIAM
This paper is about distributed derivative-based algorithms for solving optimization problems
with a separable (potentially nonconvex) objective function and coupled affine constraints. A …

TRACON: Interference-aware scheduling for data-intensive applications in virtualized environments

RC Chiang, HH Huang - … of 2011 International Conference for High …, 2011 - dl.acm.org
Large-scale data centers leverage virtualization technology to achieve excellent resource
utilization, scalability, and high availability. Ideally, the performance of an application …

Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA

G Bosilca, A Bouteiller, A Danalis… - … on Parallel and …, 2011 - ieeexplore.ieee.org
We present a method for developing dense linear algebra algorithms that seamlessly scales
to thousands of cores. It can be done with our project called DPLASMA (Distributed …

Real-time big data stream processing using GPU with spark over hadoop ecosystem

MM Rathore, H Son, A Ahmad, A Paul… - International Journal of …, 2018 - Springer
In this technological era, every person, authorities, entrepreneurs, businesses, and many
things around us are connected to the internet, forming Internet of thing (IoT). This generates …

Task superscalar: An out-of-order task pipeline

Y Etsion, F Cabarcas, A Rico, A Ramirez… - 2010 43rd Annual …, 2010 - ieeexplore.ieee.org
We present Task Superscalar, an abstraction of instruction-level out-of-order
pipeline that operates at the task-level. Like ILP pipelines, which uncover parallelism in a …

A real-time fall detection system using a wearable gait analysis sensor and a Support Vector Machine (SVM) classifier

N Shibuya, BT Nukala, AI Rodriguez… - … on mobile computing …, 2015 - ieeexplore.ieee.org
In this study, we report a custom designed wireless gait analysis sensor (WGAS) system for
real-time fall detection using a Support Vector Machine (SVM) classifier. Our WGAS includes …

Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems

F Song, S Tomov, J Dongarra - … of the 26th ACM international conference …, 2012 - dl.acm.org
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous
multicore and multi-GPU systems to support dense matrix computations efficiently. The main …