DaDianNao: A machine-learning supercomputer

Y Chen, T Luo, S Liu, S Zhang, L He… - 2014 47th Annual …, 2014 - ieeexplore.ieee.org
Many companies are deploying services, either for consumers or industry, which are largely
based on machine-learning algorithms for sophisticated processing of large amounts of …

GraphP: Reducing communication for PIM-based graph processing with efficient data partition

M Zhang, Y Zhuo, C Wang, M Gao, Y Wu… - … Symposium on High …, 2018 - ieeexplore.ieee.org
Processing-In-Memory (PIM) is an effective technique that reduces data movements by
integrating processing units within memory. The recent advance of “big data” and 3D …

GraphQ: Scalable PIM-based graph processing

Y Zhuo, C Wang, M Zhang, R Wang, D Niu… - Proceedings of the …, 2019 - dl.acm.org
Processing-In-Memory (PIM) architectures based on recent technology advances (e.g.,
Hybrid Memory Cube) demonstrate great potential for graph processing. However, existing …

DaDianNao: A neural network supercomputer

T Luo, S Liu, L Li, Y Wang, S Zhang… - IEEE Transactions …, 2016 - ieeexplore.ieee.org
Many companies are deploying services largely based on machine-learning algorithms for
sophisticated processing of large amounts of data, either for consumers or industry. The …

Application mapping onto mesh-based network-on-chip using discrete particle swarm optimization

PK Sahu, T Shah, K Manna… - IEEE Transactions on …, 2013 - ieeexplore.ieee.org
This paper presents a discrete particle swarm optimization (PSO)-based strategy to map
applications on both 2-D and 3-D mesh-connected Networks-on-Chip. The basic PSO …

A survey on memory-centric computer architectures

A Gebregiorgis, HA Du Nguyen, J Yu… - ACM Journal on …, 2022 - dl.acm.org
Faster and cheaper computers have been constantly demanding technological and
architectural improvements. However, current technology is suffering from three technology …

Spara: An energy-efficient ReRAM-based accelerator for sparse graph analytics applications

L Zheng, J Zhao, Y Huang, Q Wang… - 2020 IEEE …, 2020 - ieeexplore.ieee.org
Resistive random access memory (ReRAM) addresses the high memory bandwidth
requirement challenge of graph analytics by integrating the computing logic in the memory …

Multiresolution tomographic reconstruction using wavelets

AH Delaney, Y Bresler - IEEE Transactions on image …, 1995 - ieeexplore.ieee.org
Shows how the separable two-dimensional wavelet representation leads naturally to an
efficient multiresolution tomographic reconstruction algorithm. This algorithm is similar to the …

A 340 mV-to-0.9 V 20.2 Tb/s source-synchronous hybrid packet/circuit-switched 16×16 network-on-chip in 22 nm tri-gate CMOS

G Chen, MA Anders, H Kaul… - IEEE Journal of Solid …, 2014 - ieeexplore.ieee.org
A 16×16 mesh network-on-chip (NoC) is fabricated in 22 nm tri-gate CMOS for
high-throughput, energy-efficient on-chip interconnect in multi-core processors and systems-on …

Up by their bootstraps: Online learning in artificial neural networks for CMP uncore power management

JY Won, X Chen, P Gratz, J Hu… - 2014 IEEE 20th …, 2014 - ieeexplore.ieee.org
With increasing core counts in Chip Multi-Processor (CMP) designs, the size of the on-chip
communication fabric and shared Last-Level Caches (LLC), which we term uncore here, is …