I-GCN: A graph convolutional network accelerator with runtime locality enhancement through islandization
Graph Convolutional Networks (GCNs) have drawn tremendous attention in the past three
years. Compared with other deep learning modalities, high-performance hardware …
The pyramid match kernel: Discriminative classification with sets of image features
Discriminative learning is challenging when examples are sets of features, and the sets vary
in cardinality and lack any sort of meaningful ordering. Kernel-based classification methods …
QGTC: accelerating quantized graph neural networks via GPU tensor core
Over the most recent years, quantized graph neural network (QGNN) attracts lots of research
and industry attention due to its high robustness and low computation and memory …
Rabbit order: Just-in-time parallel reordering for fast graph analysis
Ahead-of-time data layout optimization by vertex reordering is a widely used technique to
improve memory access locality in graph analysis. While reordered graphs yield better …
When is graph reordering an optimization? Studying the effect of lightweight graph reordering across applications and input graphs
Graph processing applications are notorious for exhibiting poor cache locality due to an
irregular memory access pattern. However, prior work on graph reordering has observed …
Traversing large graphs on GPUs with unified memory
Due to the limited capacity of GPU memory, the majority of prior work on graph applications
on GPUs has been restricted to graphs of modest sizes that fit in memory. Recent hardware …
iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees
Detecting strongly connected components (SCCs) in a directed graph is crucial for
understanding the structure of graphs. Most real-world graphs have one large SCC that …
Scalable computation of anisotropic vibrations for large macromolecular assemblies
The Normal Mode Analysis (NMA) is a standard approach to elucidate the
anisotropic vibrations of macromolecules at their folded states, where low-frequency …
A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives
Graph-related applications have experienced significant growth in academia and industry,
driven by the powerful representation capabilities of graph. However, efficiently executing …
The reverse Cuthill-McKee algorithm in distributed-memory
Ordering vertices of a graph is key to minimize fill-in and data structure size in sparse direct
solvers, maximize locality in iterative solvers, and improve performance in graph algorithms …