Mosaic: Processing a trillion-edge graph on a single machine
Processing a one trillion-edge graph has recently been demonstrated by distributed graph
engines running on clusters of tens to hundreds of nodes. In this paper, we employ a single …
Gluon: A communication-optimizing substrate for distributed heterogeneous graph analytics
This paper introduces a new approach to building distributed-memory graph analytics
systems that exploits heterogeneity in processor types (CPU and GPU), partitioning policies …
MariusGNN: Resource-efficient out-of-core training of graph neural networks
We study training of Graph Neural Networks (GNNs) for large-scale graphs. We revisit the
premise of using distributed training for billion-scale graphs and show that for graphs that fit …
Automatic cricket highlight generation using event-driven and excitement-based features
Producing sports highlights is labor-intensive work that requires some degree of
specialization. We propose a model capable of automatically generating sports highlights …
TurboGraph++: A scalable and fast graph analytics system
Existing distributed graph analytics systems are categorized into two main groups: those that
focus on efficiency with a risk of out-of-memory error and those that focus on scale-up with a …
The computational sprinting game
Computational sprinting is a class of mechanisms that boost performance but dissipate
additional power. We describe a sprinting architecture in which many, independent chip …
Parallel strong connectivity based on faster reachability
Computing strongly connected components (SCC) is among the most fundamental problems
in graph analytics. Given the large size of today's real-world graphs, parallel SCC …
Speeding up SpMV for power-law graph analytics by enhancing locality & vectorization
Graph analytics applications often target large-scale web and social networks, which are
typically power-law graphs. Graph algorithms can often be recast as generalized Sparse …
Gluon-async: A bulk-asynchronous system for distributed and heterogeneous graph analytics
Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like
D-IrGL and Lux, use a bulk-synchronous parallel (BSP) programming and execution model …
High-Performance and Flexible Parallel Algorithms for Semisort and Related Problems
Semisort is a fundamental algorithmic primitive widely used in the design and analysis of
efficient parallel algorithms. It takes input as an array of records and a function extracting a …