I-GCN: A graph convolutional network accelerator with runtime locality enhancement through islandization

T Geng, C Wu, Y Zhang, C Tan, C Xie, H You… - MICRO-54: 54th annual …, 2021 - dl.acm.org
Graph Convolutional Networks (GCNs) have drawn tremendous attention in the past three
years. Compared with other deep learning modalities, high-performance hardware …

The pyramid match kernel: Discriminative classification with sets of image features

K Grauman, T Darrell - … on Computer Vision (ICCV'05) Volume …, 2005 - ieeexplore.ieee.org
Discriminative learning is challenging when examples are sets of features, and the sets vary
in cardinality and lack any sort of meaningful ordering. Kernel-based classification methods …

QGTC: accelerating quantized graph neural networks via GPU tensor core

Y Wang, B Feng, Y Ding - Proceedings of the 27th ACM SIGPLAN …, 2022 - dl.acm.org
Over the most recent years, quantized graph neural network (QGNN) attracts lots of research
and industry attention due to its high robustness and low computation and memory …

Rabbit order: Just-in-time parallel reordering for fast graph analysis

J Arai, H Shiokawa, T Yamamuro… - 2016 IEEE …, 2016 - ieeexplore.ieee.org
Ahead-of-time data layout optimization by vertex reordering is a widely used technique to
improve memory access locality in graph analysis. While reordered graphs yield better …

When is graph reordering an optimization? Studying the effect of lightweight graph reordering across applications and input graphs

V Balaji, B Lucia - 2018 IEEE International Symposium on …, 2018 - ieeexplore.ieee.org
Graph processing applications are notorious for exhibiting poor cache locality due to an
irregular memory access pattern. However, prior work on graph reordering has observed …

Traversing large graphs on GPUs with unified memory

P Gera, H Kim, P Sao, H Kim, D Bader - Proceedings of the VLDB …, 2020 - dl.acm.org
Due to the limited capacity of GPU memory, the majority of prior work on graph applications
on GPUs has been restricted to graphs of modest sizes that fit in memory. Recent hardware …

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees

Y Ji, H Liu, Y Hu, HH Huang - ACM Transactions on Parallel Computing, 2022 - dl.acm.org
Detecting strongly connected components (SCCs) in a directed graph is crucial for
understanding the structure of graphs. Most real-world graphs have one large SCC that …

Scalable computation of anisotropic vibrations for large macromolecular assemblies

JH Lam, A Nakano, V Katritch - Nature Communications, 2024 - nature.com
The Normal Mode Analysis (NMA) is a standard approach to elucidate the
anisotropic vibrations of macromolecules at their folded states, where low-frequency …

A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives

Z Lv, M Yan, X Liu, M Dong, X Ye, D Fan… - arXiv preprint arXiv …, 2023 - arxiv.org
Graph-related applications have experienced significant growth in academia and industry,
driven by the powerful representation capabilities of graph. However, efficiently executing …

The reverse Cuthill-McKee algorithm in distributed-memory

A Azad, M Jacquelin, A Buluç… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
Ordering vertices of a graph is key to minimize fill-in and data structure size in sparse direct
solvers, maximize locality in iterative solvers, and improve performance in graph algorithms …