I-GCN: A graph convolutional network accelerator with runtime locality enhancement through islandization

T Geng, C Wu, Y Zhang, C Tan, C Xie, H You… - MICRO-54: 54th annual …, 2021 - dl.acm.org
Graph Convolutional Networks (GCNs) have drawn tremendous attention in the past three
years. Compared with other deep learning modalities, high-performance hardware …

SISA: Set-centric instruction set architecture for graph mining on processing-in-memory systems

M Besta, R Kanakagiri, G Kwasniewski… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
Simple graph algorithms such as PageRank have been the target of numerous hardware
accelerators. Yet, there also exist much more complex graph mining algorithms for problems …

GraphIt: A high-performance graph DSL

Y Zhang, M Yang, R Baghdadi, S Kamil… - Proceedings of the …, 2018 - dl.acm.org
The performance bottlenecks of graph applications depend not only on the algorithm and
the underlying hardware, but also on the size and structure of the input graph. As a result …

Exploiting locality in graph analytics through hardware-accelerated traversal scheduling

A Mukkara, N Beckmann, M Abeydeera… - 2018 51st Annual …, 2018 - ieeexplore.ieee.org
Graph processing is increasingly bottlenecked by main memory accesses. On-chip caches
are of little help because the irregular structure of graphs causes seemingly random memory …

FeatGraph: A flexible and efficient backend for graph neural network systems

Y Hu, Z Ye, M Wang, J Yu, D Zheng, M Li… - … Conference for High …, 2020 - ieeexplore.ieee.org
Graph neural networks (GNNs) are gaining popularity as a promising approach to machine
learning on graphs. Unlike traditional graph workloads where each vertex/edge is …

GraphLily: Accelerating graph linear algebra on HBM-equipped FPGAs

Y Hu, Y Du, E Ustun, Z Zhang - 2021 IEEE/ACM International …, 2021 - ieeexplore.ieee.org
Graph processing is typically memory bound due to low compute to memory access ratio
and irregular data access pattern. The emerging high-bandwidth memory (HBM) delivers …

Terrace: A hierarchical graph container for skewed dynamic graphs

P Pandey, B Wheatman, H Xu, A Buluc - Proceedings of the 2021 …, 2021 - dl.acm.org
Various applications model problems as streaming graphs, which need to quickly apply a
stream of updates and run algorithms on the updated graph. Furthermore, many dynamic …

PHI: Architectural support for synchronization- and bandwidth-efficient commutative scatter updates

A Mukkara, N Beckmann, D Sanchez - … of the 52nd Annual IEEE/ACM …, 2019 - dl.acm.org
Many applications perform frequent scatter update operations to large data structures. For
example, in push-style graph algorithms, processing each vertex requires updating the data …

When is graph reordering an optimization? Studying the effect of lightweight graph reordering across applications and input graphs

V Balaji, B Lucia - 2018 IEEE International Symposium on …, 2018 - ieeexplore.ieee.org
Graph processing applications are notorious for exhibiting poor cache locality due to an
irregular memory access pattern. However, prior work on graph reordering has observed …

Traversing large graphs on GPUs with unified memory

P Gera, H Kim, P Sao, H Kim, D Bader - Proceedings of the VLDB …, 2020 - dl.acm.org
Due to the limited capacity of GPU memory, the majority of prior work on graph applications
on GPUs has been restricted to graphs of modest sizes that fit in memory. Recent hardware …