DaDianNao: A machine-learning supercomputer
Many companies are deploying services, either for consumers or industry, which are largely
based on machine-learning algorithms for sophisticated processing of large amounts of …
GraphP: Reducing communication for PIM-based graph processing with efficient data partition
Processing-In-Memory (PIM) is an effective technique that reduces data movements by
integrating processing units within memory. The recent advance of “big data” and 3D …
GraphQ: Scalable PIM-based graph processing
Processing-In-Memory (PIM) architectures based on recent technology advances (e.g.,
Hybrid Memory Cube) demonstrate great potential for graph processing. However, existing …
DaDianNao: A neural network supercomputer
T Luo, S Liu, L Li, Y Wang, S Zhang… - IEEE Transactions …, 2016 - ieeexplore.ieee.org
Many companies are deploying services largely based on machine-learning algorithms for
sophisticated processing of large amounts of data, either for consumers or industry. The …
Application mapping onto mesh-based network-on-chip using discrete particle swarm optimization
This paper presents a discrete particle swarm optimization (PSO)-based strategy to map
applications on both 2-D and 3-D mesh-connected Networks-on-Chip. The basic PSO …
A survey on memory-centric computer architectures
Faster and cheaper computers have been constantly demanding technological and
architectural improvements. However, current technology is suffering from three technology …
Spara: An energy-efficient ReRAM-based accelerator for sparse graph analytics applications
Resistive random access memory (ReRAM) addresses the high memory bandwidth
requirement challenge of graph analytics by integrating the computing logic in the memory …
Multiresolution tomographic reconstruction using wavelets
AH Delaney, Y Bresler - IEEE Transactions on image …, 1995 - ieeexplore.ieee.org
Shows how the separable two-dimensional wavelet representation leads naturally to an
efficient multiresolution tomographic reconstruction algorithm. This algorithm is similar to the …
A 340 mV-to-0.9 V 20.2 Tb/s source-synchronous hybrid packet/circuit-switched 16×16 network-on-chip in 22 nm tri-gate CMOS
G Chen, MA Anders, H Kaul… - IEEE Journal of Solid …, 2014 - ieeexplore.ieee.org
A 16×16 mesh network-on-chip (NoC) is fabricated in 22 nm tri-gate CMOS for high-
throughput, energy-efficient on-chip interconnect in multi-core processors and systems-on …
Up by their bootstraps: Online learning in artificial neural networks for CMP uncore power management
With increasing core counts in Chip Multi-Processor (CMP) designs, the size of the on-chip
communication fabric and shared Last-Level Caches (LLC), which we term uncore here, is …