Memory coherence in shared virtual memory systems
The memory coherence problem in designing and implementing a shared virtual memory on
loosely coupled multiprocessors is studied in depth. Two classes of algorithms, centralized …
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU
Recent advances in computing have led to an explosion in the amount of data being
generated. Processing the ever-growing data in a timely manner has made throughput …
MachSuite: Benchmarks for accelerator design and customized architectures
Recent high-level synthesis and accelerator-related architecture papers show a great
disparity in workload selection. To improve standardization within the accelerator research …
GPU-accelerated preconditioned iterative linear solvers
This work is an overview of our preliminary experience in developing a high-performance
iterative linear solver accelerated by GPU coprocessors. Our goal is to illustrate the …
Automatically tuning sparse matrix-vector multiplication for GPU architectures
Graphics processors are increasingly used in scientific applications due to their high
computational power, which comes from hardware with multiple-level parallelism and …
An initial exploration of a multi-sensory design space: Tactile support for walking in immersive virtual environments
Multi-sensory feedback can potentially improve user experience and performance in virtual
environments. As it is complicated to study the effect of multi-sensory feedback as a single …
GPU-acceleration for moving particle semi-implicit method
The MPS (Moving Particle Semi-implicit) method has been proven useful in computation free-
surface hydrodynamic flows. Despite its applicability, one of its drawbacks in practical …
Fast parallel Markov clustering in bioinformatics using massively parallel computing on GPU with CUDA and ELLPACK-R sparse format
Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining
clusters in networks. However, with increasing vast amount of data on biological networks …
A memory efficient and fast sparse matrix vector product on a GPU
This paper proposes a new sparse matrix storage format which allows an efficient
implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit …
Automatic tuning of the sparse matrix vector product on GPUs based on the ELLR-T approach
A wide range of applications in engineering and scientific computing are involved in the
acceleration of the sparse matrix vector product (SpMV). Graphics Processing Units (GPUs) …