Data locality in high performance computing, big data, and converged systems: An analysis of the cutting edge and a future system architecture
Big data has revolutionized science and technology leading to the transformation of our
societies. High-performance computing (HPC) provides the necessary computational power …
X-Stream: Edge-centric graph processing using streaming partitions
X-Stream is a system for processing both in-memory and out-of-core graphs on a single
shared-memory machine. While retaining the scatter-gather programming model with state …
[BOOK][B] Introduction to algorithms
A comprehensive update of the leading algorithms text, with new material on matchings in
bipartite graphs, online algorithms, machine learning, and other topics. Some books on …
Scheduling multithreaded computations by work stealing
RD Blumofe, CE Leiserson - Journal of the ACM (JACM), 1999 - dl.acm.org
This paper studies the problem of efficiently scheduling fully strict (i.e., well-structured)
multithreaded computations on parallel computers. A popular and practical method of …
Analysis of multithreaded programs
M Rinard - International Static Analysis Symposium, 2001 - Springer
The field of program analysis has focused primarily on sequential programming languages.
But multithreading is becoming increasingly important, both as a program structuring …
HPCTOOLKIT: tools for performance analysis of optimized parallel programs
L Adhianto, S Banerjee, M Fagan… - Concurrency and …, 2010 - Wiley Online Library
HPCToolkit is an integrated suite of tools that supports measurement, analysis, attribution,
and presentation of application performance for both sequential and parallel programs …
Evaluating MapReduce for multi-core and multiprocessor systems
C Ranger, R Raghuraman, A Penmetsa… - 2007 IEEE 13th …, 2007 - ieeexplore.ieee.org
This paper evaluates the suitability of the MapReduce model for multi-core and multi-
processor systems. MapReduce was created by Google for application development on data …
[BOOK][B] Patterns for parallel programming
TG Mattson, B Sanders, B Massingill - 2004 - books.google.com
The Parallel Programming Guide for Every Software Developer From grids and clusters to
next-generation game consoles, parallel computing is going mainstream. Innovations such …
Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks
This paper introduces a storage format for sparse matrices, called compressed sparse
blocks (CSB), which allows both Ax and A^T x to be computed efficiently in parallel, where A is …
CRISP: Critical path analysis of large-scale microservice architectures
Microservice architectures have become the lifeblood of modern service-oriented software
systems. Remote Procedure Calls (RPCs) among microservices are deeply nested …