[BOOK][B] Structured parallel programming: patterns for efficient computation
M McCool, J Reinders, A Robison - 2012 - books.google.com
Structured Parallel Programming offers the simplest way for developers to learn patterns for
high-performance parallel programming. Written by parallel computing experts and industry …
A survey on parallelism and determinism
Parallelism is often required for performance. In these situations an excess of non-
determinism is harmful as it means the program can have several different behaviours or …
Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks
This paper introduces a storage format for sparse matrices, called compressed sparse
blocks (CSB), which allows both Ax and Aᵀx to be computed efficiently in parallel, where A is …
The Cilk++ concurrency platform
CE Leiserson - Proceedings of the 46th Annual Design Automation …, 2009 - dl.acm.org
The availability of multicore processors across a wide range of computing platforms has
created a strong demand for software frameworks that can harness these resources. This …
A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers)
We have developed a multithreaded implementation of breadth-first search (BFS) of a
sparse graph using the Cilk++ extensions to C++. Our PBFS program on a single processor …
The Cilkview scalability analyzer
The Cilkview scalability analyzer is a software tool for profiling, estimating scalability, and
benchmarking multithreaded Cilk++ applications. Cilkview monitors logical parallelism …
GraphGrind: Addressing load imbalance of graph partitioning
We investigate how graph partitioning adversely affects the performance of graph analytics.
We demonstrate that graph partitioning induces extra work during graph traversal and that …
Composable parallel patterns with Intel Cilk Plus
AD Robison - Computing in Science & Engineering, 2013 - computer.org
COMPOSABLE PARALLEL PATTERNS WITH INTEL CILK PLUS Page 1 66 Copublished by
the IEEE CS and the AIP 1521-9615/13/$31.00 © 2013 IEEE COMPUTING IN SCIENCE & …
Concurrent programming with revisions and isolation types
Building applications that are responsive and can exploit parallel hardware while remaining
simple to write, understand, test, and maintain, poses an important challenge for developers …