Space efficient sequence alignment for SRAM-based computing: X-drop on the Graphcore IPU

L Burchard, MX Zhao, J Langguth, A Buluç… - Proceedings of the …, 2023 - dl.acm.org
Dedicated accelerator hardware has become essential for processing AI-based workloads,
leading to the rise of novel accelerator architectures. Furthermore, fundamental differences …

Massive data-centric parallelism in the chiplet era

M Orenes-Vera, E Tureci, D Wentzlaff… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent works have introduced task-based parallelization schemes to accelerate graph
search and sparse data-structure traversal, where some solutions scale up to thousands of …

Exploiting deep learning accelerators for neuromorphic workloads

PSV Sun, A Titterton, A Gopiani, T Santos… - Neuromorphic …, 2024 - iopscience.iop.org
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms
of energy consumption and latency when performing inference with deep learning …

Intelligence processing units accelerate neuromorphic learning

PSV Sun, A Titterton, A Gopiani, T Santos… - arXiv preprint arXiv …, 2022 - arxiv.org
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms
of energy consumption and latency when performing inference with deep learning …

Performance Evaluation of Parallel Graphs Algorithms Utilizing Graphcore IPU

P Gepner, B Kocot, M Paprzycki, M Ganzha, L Moroz… - Electronics, 2024 - mdpi.com
Recent years have seen increasing interest in graph computations. This trend can be attributed to the large number of potential application areas. Moreover, increasing …

iPUG for multiple Graphcore IPUs: Optimizing performance and scalability of parallel breadth-first search

L Burchard, X Cai, J Langguth - 2021 IEEE 28th International …, 2021 - ieeexplore.ieee.org
Parallel graph algorithms have become one of the principal applications of high-
performance computing besides numerical simulations and machine learning workloads …

Steering customized AI architectures for HPC scientific applications

H Ltaief, Y Hong, A Dabah, R Alomairy… - … Conference on High …, 2023 - Springer
AI hardware technologies have revolutionized computational science. While they have been
mostly used to accelerate deep learning training and inference models for machine learning …

Tascade: Hardware support for atomic-free, asynchronous and efficient reduction trees

M Orenes-Vera, E Tureci, D Wentzlaff… - arXiv preprint arXiv …, 2023 - arxiv.org
Graph search and sparse data-structure traversal workloads contain challenging irregular
memory patterns on global data structures that need to be modified atomically. Distributed …

Implementing spatio-temporal graph convolutional networks on Graphcore IPUs

J Moe, K Pogorelov, DT Schroeder… - 2022 IEEE …, 2022 - ieeexplore.ieee.org
Artificial neural networks have been used for a multitude of regression tasks, and their
descendants have expanded the domain to many applications such as image and speech …

Enabling unstructured-mesh computation on massively tiled AI processors: An example of accelerating in silico cardiac simulation

L Burchard, KG Hustad, J Langguth, X Cai - Frontiers in Physics, 2023 - frontiersin.org
A new trend in processor architecture design is the packaging of thousands of small
processor cores into a single device, where there is no device-level shared memory but …