A survey of CPU-GPU heterogeneous computing techniques
As both CPUs and GPUs become employed in a wide range of applications, it has been
acknowledged that both of these Processing Units (PUs) have their unique features and …
A heuristic clustering-based task deployment approach for load balancing using Bayes theorem in cloud environment
Aiming at the current problems that most physical hosts in the cloud data center are so
overloaded that it makes the whole cloud data center's load imbalanced and that existing load …
PVFMM: A parallel kernel independent FMM for particle and volume potentials
We describe our implementation of a parallel fast multipole method for evaluating potentials
for discrete and continuous source distributions. The first requires summation over the …
A comparison of binarization methods for historical archive documents
J He, QDM Do, AC Downton… - … Conference on Document …, 2005 - ieeexplore.ieee.org
This paper compares several alternative binarization algorithms for historical archive
documents, by evaluating their effect on end-to-end word recognition performance in a …
An FMM based on dual tree traversal for many-core architectures
R Yokota - Journal of Algorithms & Computational …, 2013 - journals.sagepub.com
The present work attempts to integrate the independent efforts in the fast N-body community
to create the fastest N-body library for many-core and heterogeneous architectures. Focus is …
Suitability analysis of FPGAs for heterogeneous platforms in HPC
FA Escobar, X Chang… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
High performance computing (HPC) systems currently integrate several resources such as
multi-cores (CPUs), graphic processing units (GPUs) and reconfigurable logic devices, like …
A peta-scalable CPU-GPU algorithm for global atmospheric simulations
Developing highly scalable algorithms for global atmospheric modeling is becoming
increasingly important as scientists inquire to understand behaviors of the global …
Task‐based FMM for heterogeneous architectures
High performance fast multipole method is crucial for the numerical simulation of many
physical problems. In a previous study, we have shown that task‐based fast multipole …
Enabling and scaling a global shallow-water atmospheric model on Tianhe-2
This paper presents a hybrid algorithm for the petascale global simulation of atmospheric
dynamics on Tianhe-2, the world's current top-ranked supercomputer developed by China's …
Algorithm 967: A distributed-memory fast multipole method for volume potentials
The solution of a constant-coefficient elliptic Partial Differential Equation (PDE) can be
computed using an integral transform: A convolution with the fundamental solution of the …